
Universidad Autónoma de Nuevo León

Facultad de Ingeniería Mecánica y Eléctrica

Semester August-June 2022

Information Theory and Coding

Fundamental Activity 1.3


Artificial Symbol Generation of the Information Source

Instructor: José Ramon Rodríguez Cruz

Students / Student ID
Alfredo de Alba Alvarado 1794507
Kenya Giselle Martinez Puente 1862460

Degree program: Electronics and Communications Engineering


Day and Hour: Tuesday, M1-M3

Date: 23-May-2022
Introduction
In this activity we show how to generate symbols artificially and analyze the
statistical validity of the procedure. Good source models lead to more efficient
algorithms; the simplest model assumes that each generated symbol is independent
of every other and that all symbols occur with the same probability. We also
discuss the generation algorithm and the relationship between a histogram and
relative frequencies, among other topics. The changes made to the MATLAB program,
together with the expressions used in it, are included in the activity. After
segmenting the data series and extracting the parameters, we obtain a set of
symbols that can be generated artificially.

A histogram shows the shape, or distribution, of the values of a continuous
variable. Histograms help you see the center, spread, and shape of a data set. They
can also be used as a visual tool to check for normality. Histograms are one of the
seven basic statistical quality control tools.

Histograms offer a good way to evaluate data. They can be used to check for
outliers and help you understand the distribution of your data. Understanding the
distribution of a variable is important when choosing appropriate statistical analysis
tools.
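The link between a histogram and relative frequencies can be sketched in a few lines of MATLAB. This is only an illustration: it reuses the 12 bar heights that appear in the code of this report, and the variable names (counts, relFreq) are our own choice.

```matlab
% Bar heights of the 12-bar histogram, as listed in the code of this report.
counts = [4 22 85 224 236 202 123 24 12 12 5 6];

% Relative frequency of each bar: its count divided by the total count.
relFreq = counts / sum(counts);

% The relative frequencies are non-negative and add up to 1, so they can
% be used as estimates of the symbol probabilities.
total = sum(relFreq);

% Print one line per bar: bar index and its relative frequency.
fprintf('%2d  %.4f\n', [1:12; relFreq]);
```

Dividing by the sum of the counts, rather than by a fixed number, guarantees that the estimated probabilities add up to 1.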
Data Generation Analysis

This is the code we wrote in MATLAB to generate the symbols:

% Load the audio file (the path is specific to the authors' machine).
[y,Fs]=audioread('C:\Users\alfre\Desktop\Carpetas\music\Rocky Theme Song.mp3');
y=y(:,1);                        % use the first channel if the file is stereo
Nm=round(Fs/4);                  % samples per segment (0.25 s)
maxLag=round(Nm*0.2);            % largest lag of the autocorrelation

% Autocorrelation and LPC model of segment number 500.
seg500=y((500-1)*Nm+1:500*Nm);
Rcoef=xcorr(seg500,maxLag,'coeff');
plot(Rcoef(maxLag+1:end));       % non-negative lags only (0..maxLag)
grid
[a,g]=lpc(seg500,200);

% LPC coefficients (order 200) of the first 500 segments, one row per segment.
Ma=zeros(500,201);
for i=1:500
    seg=y((i-1)*Nm+1:i*Nm);
    Ma(i,:)=lpc(seg,200);
end
A=Ma(:,2:199);
figure
hist(A(:,3),12)                  % 12-bar histogram of one coefficient

% Bar heights read from the histogram. They add up to 955, so dividing each
% one by 500 would give probabilities that sum to more than 1; we normalize
% by the total instead.
counts=[4 22 85 224 236 202 123 24 12 12 5 6];
P=counts/sum(counts);
Pcum=cumsum(P);                  % cumulative probabilities P1, P1+P2, ...
Pcum(end)=1;                     % guard against round-off

% Generate 500 symbols: u selects the first bar whose cumulative
% probability is greater than or equal to u.
x=zeros(500,1);
for i=1:500
    u=rand;
    x(i)=find(u<=Pcum,1);
end
figure
hist(x,1:12)                     % histogram of the generated symbols
n=numel(Ma)
The code generates symbols at random from the 12-bar histogram obtained in
activity number 2: the bin center of each histogram bar identifies a symbol, the
relative frequency of the bar is taken as the probability of that symbol, and a
uniform random number selects which symbol is generated. In our case, when
running the program in MATLAB, it generated a total of 4200 symbols.
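The idea of taking the bin center of each bar and drawing symbols at random can be sketched as follows. This is a minimal illustration, not the program used in the activity: the input data is a hypothetical stand-in (random numbers instead of LPC coefficients), and the names data, centers, Pcum and values are assumptions made for the example.

```matlab
% Hypothetical stand-in for one column of coefficients (not the real data).
rng(1);                          % fixed seed so the run is repeatable
data = randn(500, 1);

% 12-bar histogram: count of each bar and the center of each bar.
[counts, centers] = hist(data, 12);

% Cumulative probabilities of the 12 symbols.
Pcum = cumsum(counts) / sum(counts);
Pcum(end) = 1;                   % guard against round-off in the last value

% Draw N symbol indices: each uniform number selects the first bar whose
% cumulative probability is greater than or equal to it.
N = 500;
x = zeros(N, 1);
for i = 1:N
    x(i) = find(rand <= Pcum, 1);
end

% Replace each symbol index by the center of its histogram bar.
values = centers(x);
```

Here x holds the symbol indices (1 to 12) and values holds the corresponding bin centers, so either one can be passed to later stages depending on whether indices or amplitudes are needed.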

To apply the Chauvenet criterion, the mean and the standard deviation of the
observed data must first be calculated. Based on how much the suspect value
differs from the mean, the normal distribution function (or a table of it) is
used to determine the probability that a datum deviates from the mean by at least
as much as the suspect value. This probability is multiplied by the number of data
points in the chosen sample. If the result is less than 0.5, the doubtful value
can be discarded.

Example: a reading can be rejected if the probability of obtaining its deviation
from the mean is less than 1/(2n). Applying this criterion to our original matrix
would let us discard the doubtful values and, in that way, reduce the number of
symbols.
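The steps of the Chauvenet criterion can be written as a short MATLAB sketch. The readings below are hypothetical values chosen only to illustrate the procedure (they are not taken from our matrix), and the tail probability is computed with erfc under the assumption that the data is approximately normal.

```matlab
% Hypothetical sample (not the data of this activity): ten readings,
% one of which looks suspicious.
data = [9.8 10.1 10.0 9.9 10.2 10.0 9.7 10.3 10.1 14.0];

n     = numel(data);
mu    = mean(data);
sigma = std(data);               % sample standard deviation

% Deviation of every reading from the mean, in standard deviations.
z = abs(data - mu) / sigma;

% Two-sided tail probability of a normal variable: P(|Z| >= z).
pTail = erfc(z / sqrt(2));

% Chauvenet criterion: reject when n * P < 0.5, that is, P < 1/(2n).
reject = n * pTail < 0.5;

% Readings that survive the criterion.
kept = data(~reject);
```

With this sample only the last reading is flagged, because it is the only one whose tail probability falls below 1/(2n).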

Next we analyze and compare the two histograms. As can be seen in the two
histograms, some bars decrease and others increase. This is due to the random
generation of the symbols: the number of samples that falls in each bar changes
from run to run, and with it the size of the histogram bars. With more generated
samples, the relative frequencies would be expected to move closer to those of
the original histogram.
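The effect of the number of samples on the histogram can be checked with a small experiment. The sketch below is an illustration under our own assumptions: it uses the 12 bar heights from the code of this report as target probabilities, generates symbols for three arbitrary sample sizes, and measures the largest difference between the generated relative frequencies and the target ones.

```matlab
counts = [4 22 85 224 236 202 123 24 12 12 5 6];   % bar heights from this report
P      = counts / sum(counts);                     % target probabilities
Pcum   = cumsum(P);
Pcum(end) = 1;                                     % guard against round-off

rng(1);                                            % repeatable run
sizes  = [500 5000 50000];                         % arbitrary sample sizes
maxDev = zeros(size(sizes));
for k = 1:numel(sizes)
    N = sizes(k);
    u = rand(N, 1);
    % Symbol index = 1 + number of cumulative values strictly below u.
    x = sum(bsxfun(@gt, u, Pcum), 2) + 1;
    % Relative frequency of each of the 12 symbols in this run.
    relFreq = accumarray(x, 1, [12 1])' / N;
    % Largest absolute difference with respect to the target probabilities.
    maxDev(k) = max(abs(relFreq - P));
end
disp([sizes; maxDev])
```

The largest difference typically shrinks as the sample size grows, although the exact values change from run to run and with the seed.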
[Figure: Histogram]

[Figure: Previous Histogram]
Conclusions

Thanks to MATLAB we learned how to generate symbols randomly. MATLAB makes the
task easy: once the appropriate commands are known, it does most of the work for
us. This activity will make the following activities easier, since we can now
apply compression algorithms to the generated symbols to test how efficient our
algorithm is.

