**Graphic representation and the graphic analysis**

Graphic representations are used to present statistical quantities visually; they allow the quantities to be analyzed more deeply.

A graphic representation can be built from both absolute and relative quantities.

When using the graphic method, it is important to know that the type of graphic representation must strictly correspond to the content of each index.

The following quantities are used to construct graphic representations:

__Relative quantities are:__

- intensive indices

- extensive indices

- index of correlation

- index of evidence

__Absolute quantities__

**Intensive quantities** - 4 types of diagrams:

· column

· linear

· mapgram

· mapdiagram

**Extensive quantities** (they characterize structure): sector or inwardly-column diagram.

__Indices of correlation:__ the same diagrams as for intensive quantities (column and linear diagrams, mapgram, mapdiagram).

**Indices of evidence:** the principles of graphic representation are the same as for intensive quantities.

**Column diagrams** – for illustrating homogeneous but not interconnected indices. They represent the statics of phenomena.

__Linear diagrams__ – for representing the dynamics of a phenomenon (typical examples are a temperature curve or changes in the birth rate or death rate).

**Radial diagram** – is built on a system of polar coordinates to represent a phenomenon over a closed cycle of time (day, week, year). The structure of morbidity or of causes of mortality, by contrast, is shown with a sector diagram, in which every cause occupies a sector of the circle according to its percentage.

__Mapgram__ is the representation of statistical quantities on a geographical map (or schematic map).

Absolute and other indices can be marked.

__Mapdiagram__ is the representation of different types of diagrams on a geographical map.

**Common rules for constructing graphic representations:**

· every graphic representation must have a title that states its content, time and place;

· it must be built to a definite scale;

· every graphic representation must be accompanied by an explanation of the colors or shading used (a legend).

When choosing the type of graphic representation, it is necessary to know that it must strictly correspond to the essence of the represented index.

Principles of construction and application of plane diagrams (linear, column, rectangular, sector, radial).

**Linear** diagram is used to illustrate the frequency of a phenomenon that changes with time, that is, to represent the dynamics of the phenomenon.

The basis of this diagram is the rectangular system of coordinates. For example: on the abscissa axis (X), segments are laid off to scale; on the ordinate axis (Y), the indices of morbidity (x : y = 4 : 3).

**Column** diagram (rectangular) is used to illustrate homogeneous intensive indices that are not connected with one another. It represents the dynamics or statics of phenomena.

When constructing this kind of diagram, columns are drawn whose height must correspond to the values of the represented indices, taking the scale into account. The width of the columns and the distance between them may be chosen arbitrarily, but must be identical for all columns. Columns on a diagram can be vertical or horizontal. For example: growth in the number of beds in in-patient establishments from 1990 to 2003.

**Sector** diagram

The circle is taken as 100 % (if indices are shown in %), so 1 % is equal to 3.6° of the circumference.

For example: among all infectious diseases, measles accounted for 28.6 % (28.6 × 3.6° ≈ 103°) and other infections for 71.4 % (71.4 × 3.6° ≈ 257°).

With the help of a protractor, the arcs that correspond to the size of every index are laid off on the circle. The points found on the circumference are connected with the center of the circle. The separate sectors of the circle are the parts of the phenomenon being studied.
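The conversion from percentages to sector angles can be sketched in a few lines of Python. This is only an illustration of the 1 % = 3.6° rule using the measles figures from the example; the function name `sector_angles` is an assumption introduced here.

```python
def sector_angles(percentages):
    """Convert percentage shares (which must sum to 100) into sector angles in degrees."""
    total = sum(percentages.values())
    if abs(total - 100.0) > 1e-6:
        raise ValueError(f"shares must sum to 100 %, got {total}")
    # The whole circle (360 degrees) is 100 %, so 1 % corresponds to 3.6 degrees.
    return {name: share * 3.6 for name, share in percentages.items()}


angles = sector_angles({"measles": 28.6, "other infections": 71.4})
for name, angle in angles.items():
    print(f"{name}: about {angle:.0f} degrees")
```

Rounding is left to the final display step so that the sectors still add up to a full circle.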

Instead of a sector diagram, an inwardly-column diagram can be used. Then the whole height of the column is taken as 100 %, and the extensive indices are laid off in the corresponding scale units, which together make up the whole.

**Radial** diagram is a type of linear diagram built on polar coordinates.

When constructing a radial diagram, the role of the abscissa axis (X) is played by a circle divided into equal parts according to the time intervals of the cycle.

The ordinate axis (Y) is the radius of the circle or its continuation.

The radius of the circle is taken equal to the mean value, over the time cycle, of the phenomenon being analyzed. The number of radii is equal to the number of time intervals in the cycle under study:

· 12 radii – when studying a phenomenon over a year

· 7 radii – when studying a phenomenon over a week.

The radii are numbered starting from the radius that corresponds to 12 o'clock and continuing clockwise.

After statistical processing, the results of a study can be given as graphic representations, in which numerical values are presented as drawings. Graphs give a general characterization of a phenomenon and reveal its general patterns, making it possible to analyze the study data more deeply.

They facilitate the comparison of indices, give an idea of the structure and character of the connections between phenomena, and indicate their tendencies.

Therefore, graphic demonstration is often combined with graphic analysis, in which the graphic representation serves not only as a means of demonstrating the results and conclusions of research, but also as a means of analyzing the collected material and revealing internal connections and patterns.

When constructing graphs, the following are taken into account: the character of the data to be represented, the intended use of the graph (demonstration at a conference or lecture, reproduction in a scientific work, etc.), the purpose of the graph (to show the obtained results visually, or only to emphasize a particular pattern or fact), and the level of the audience to which the graph is shown.

The choice of the type of graphic representation, the colors, the number of graphs, the proportions of the lettering, etc. depends on all of this. In all cases graphs should be clear, convenient and easy to read.

In medical statistical research, linear (coordinate) diagrams, plane diagrams and cartograms are used.

LINEAR DIAGRAMS are graphs on which numerical values are displayed by curves, which make it possible to trace the dynamics of a phenomenon over time or to find out the dependence of one attribute on another (Fig. 2.1).

**Fig.** Age-specific mortality rate of the population in Ukraine (Ukrainian Center of Medical Statistics, Kyiv, 1999)

On linear diagrams with two or more curves it is also possible to compare the values of two or more dynamic series, and to establish how changes in one series depend on fluctuations in another.

Linear diagrams are constructed in a system of rectangular coordinates, where the horizontal scale is laid off from left to right along the abscissa axis (X), and the vertical scale from bottom to top along the ordinate axis (Y). An obligatory requirement for any graph is scale: the image in the drawing must be reduced in proportion to the corresponding figures.

In contrast to linear diagrams, which describe the dynamics of a process, plane diagrams are used when it is necessary to represent statistical phenomena or facts that are independent of one another.

The simplest example of a plane diagram is a diagram made of rectangles or other figures. Numerical values on plane diagrams are represented by geometrical figures: rectangles or squares. These diagrams are used for demonstrating and popularizing data, and also when it is necessary to represent the structure of a phenomenon at one moment of observation.

For example, the age composition of those who fell ill, or the structure of morbidity in a settlement.

**Fig.** Age structure of the population (the share of each age group in the whole population).

In column (long-pillar) diagrams, numerical values are represented by rectangular columns with identical bases and different heights.

The height of a rectangle corresponds to the relative value of the phenomenon under study. To construct a column diagram we use a scale from which the height of each column can be determined.

Column diagrams serve for comparing several values. The rectangles that represent the values can also be placed horizontally rather than vertically; the result is a tape (band) diagram (Fig. 4). In some cases it is more convenient to represent values as tapes than as columns, because it is easier to label each tape with a horizontal inscription.

With the aid of column and tape diagrams it is possible not only to compare different values, but also, at the same time, to display the structure of these values and compare their parts. For example, column or tape diagrams that show the distribution of diseases by the main nosological forms can also show the percentage of diseases among men and women.

For this purpose each rectangle (column or tape) is divided into two parts, one corresponding to the number of cases among men and the other to the number among women.

Circular diagrams are used to display the ratio of homogeneous absolute values.

They use the area of a circle rather than the area of a rectangle.

It is necessary to remember, however, that the areas of circles are related to one another as the squares of their radii. Therefore, when constructing circular diagrams, we must take the square root of each value to be represented and construct the radius on that basis; given the radius, it is easy to draw the circle.
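The square-root rule can be illustrated with a short Python sketch. The values below are hypothetical, and the function name `circle_radii` and the choice to scale the largest circle to a given radius are assumptions made for the example.

```python
import math


def circle_radii(values, max_radius=1.0):
    """Return radii such that circle areas are proportional to the given values.

    The largest value is drawn with radius max_radius; the other radii are
    scaled by the square root of the ratio of the values.
    """
    largest = max(values.values())
    return {name: max_radius * math.sqrt(v / largest) for name, v in values.items()}


# Hypothetical absolute values: A is four times as large as B
radii = circle_radii({"A": 400, "B": 100})
for name, r in radii.items():
    print(f"{name}: radius {r:.2f}, area {math.pi * r ** 2:.2f}")
```

A value four times larger gets a radius only twice as large, so that the areas, not the radii, stand in the ratio 4 : 1.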

If a circular diagram displays parts of a whole, the circles should not be drawn separately from one another but superimposed on one another. The whole and its parts can also be presented as a circle divided into sectors: the sector diagram. When constructing a sector diagram, the whole area of the circle is taken as 100 %, and each sector occupies the part of the area that corresponds to its percentage.

In practice, sector diagrams can be constructed using not only the area of a circle, but also the area of a square or a rectangle.

However, such figures are harder to divide than a circle, and consequently they are rather seldom used as the basis of sector diagrams.

Radial, or linear-circular, diagrams are constructed on the basis of polar coordinates, in which the radius replaces the vertical scale of diagrams based on a system of rectangular coordinates.

An example of a radial diagram is the wind rose, which is used on maps to represent changes in wind direction during a calendar period (month, year).

Radial diagrams are used to illustrate seasonal fluctuations of values, for example morbidity or mortality rates.

These diagrams are constructed on a circle from whose center 12 radii are drawn. Each radius cuts off an arc of 30° (360/12 = 30) and represents the ordinate of one calendar month: January, February, etc.

The center of the circle is taken as the zero point, and then on each radius, according to the chosen scale, the value that shows the intensity of the phenomenon in the corresponding calendar month is marked.

By connecting the marked points, we obtain a closed line that gives a picture of the seasonal fluctuations.
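The construction can be sketched in Python as a conversion from monthly values to points on the diagram. The monthly values below are hypothetical, and the function name `radial_points` is an assumption; radii are counted clockwise from the 12 o'clock position, as described earlier for radial diagrams.

```python
import math


def radial_points(values):
    """Return (angle in degrees, x, y) for each value on a radial diagram.

    The circle is split into len(values) equal parts (30 degrees each for
    12 months). Angles are measured clockwise from the top of the circle,
    and the distance from the center equals the value itself.
    """
    step = 360.0 / len(values)
    points = []
    for i, value in enumerate(values):
        angle = i * step
        theta = math.radians(angle)
        # Clockwise from "12 o'clock": x grows to the right, y grows upward.
        points.append((angle, value * math.sin(theta), value * math.cos(theta)))
    return points


# Hypothetical monthly indices, January to December
monthly = [110, 105, 100, 95, 90, 85, 80, 85, 95, 100, 105, 110]
points = radial_points(monthly)
for angle, x, y in points:
    print(f"{angle:5.1f} deg -> ({x:7.2f}, {y:7.2f})")
```

Joining consecutive points, and the last point back to the first, gives the closed line of the diagram.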

When building radial diagrams, it is necessary to remember the rule that the radii are counted from the top of the diagram, clockwise.

**Fig.** The radial diagram. Seasonal prevalence of mortality in the population of Kalinovsky district, Vinnitsya region (Ukraine, 1984-1998).

Where necessary, cartograms are built to compare different phenomena by territory. They are geographical maps on which graphic symbols show the intensity of distribution and the grouping of a phenomenon (morbidity, mortality, etc.) for a period of time.

They are therefore best built on simplified maps on which only administrative boundaries and some large settlements are shown. When constructing a cartogram, the grouping of the displayed phenomena is of great importance.

The simplest grouping divides the indices into a group below the average and a group above the average. According to this division, regions or districts with indices above the average are shaded on the cartogram, and those below the average are left unshaded.
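This simplest grouping can be sketched in Python. The region names and rates are hypothetical, and the function name `shade_by_mean` is an assumption introduced for the example.

```python
def shade_by_mean(rates):
    """Map each region to True (shaded) if its rate is above the mean, else False."""
    mean = sum(rates.values()) / len(rates)
    return {region: rate > mean for region, rate in rates.items()}


# Hypothetical rates per 100,000 population
rates = {"Region A": 120, "Region B": 150, "Region C": 90, "Region D": 140}
shading = shade_by_mean(rates)
for region, shaded in shading.items():
    print(region, "shaded" if shaded else "not shaded")
```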

**Fig.** Regional features of mortality from cancer in Ukraine.

Graphical Representation of Data

The graphical representation of
data makes the reading more interesting, less time-consuming and easily
understandable. The disadvantage of graphical presentation is that it lacks
details and is less accurate. In our study, we have the following graphs: 1.
Bar Graphs 2. Pie Charts 3. Frequency
Polygon 4. Histogram.

Bar Graphs

This is the simplest type of
graphical presentation of data. The following types of bar graphs are possible:
(a) Simple bar graph (b) Double bar graph (c) Divided bar graph.

Pie Graph or
Pie Chart.

Sometimes a circle is used to represent the given data. Its various parts are represented proportionally by sectors of the circle. The graph is then called a Pie Graph or Pie Chart.

Bar
Graphs and Pie Charts

Bar graphs and
pie charts are commonly used to show data when the categories are qualitative.
You are probably familiar with both, but let’s review the basic ideas.

Consider the
essay grade data in Table 5.1. A bar graph would show each category with a bar
whose length corresponded to its frequency. If you make a bar graph by hand (as
opposed to with a computer), you should measure the bar lengths carefully to
make sure they correctly correspond to the frequencies. In Figure 5.3, for example, the vertical axis is marked with frequencies a centimeter apart. Thus, the bar for A grades is 2 centimeters long, because the frequency of A grades is 4. Note that the left side of the bar graph in Figure 5.3 is marked with frequency, while the right side is marked with relative frequency. As you can see, bar graphs make it easy to display both kinds of frequency simultaneously.

In contrast, pie
charts are used primarily for relative frequencies, because the total pie must
always represent the total relative frequency of 100%. The size of each wedge
is proportional to the relative frequency of the category it represents. Figure
5.4 shows a pie chart for the essay grade data. To make comparisons easier,
relative frequencies are often written on pie chart wedges.

Nowadays, most people make graphs
with the aid of computers that measure bar lengths or wedge sizes automatically.
However, you must still specify any labels or axis marks you want on a graph.
This labeling is extremely important: Without proper labels, a graph is
meaningless. The following summary lists the important labels for graphs. Of
course, not all labels are necessary in all cases. For example, pie charts do
not require a vertical or horizontal scale. Notice how these rules were applied
in Figure 5.3.

Frequency Polygon

In a frequency distribution, the
mid-value of each class is obtained. Then on the graph paper, the frequency is
plotted against the corresponding mid-value. These points are joined by
straight lines. These straight lines may be extended in both directions to meet
the X - axis to form a polygon.

Histogram

A two dimensional frequency
density diagram is called a histogram. A histogram is a diagram which
represents the class interval and frequency in the form of a rectangle.

In a simple bar graph, the height of each bar represents the frequency. The thickness has no significance. All bars have the same thickness.

We use double bar graph when we
want to compare two things.

In the frequency polygon, the
frequency is plotted against the mid value of each class. These points are
joined by line segments.

The scientific method of collecting data, classifying it and applying it to commerce and everyday life is called statistics. Some important terms are as follows: ungrouped data, tabulation of data, range, frequency, frequency distribution, tally, inclusive type of grouped frequency distribution, exclusive type of grouped frequency distribution, lower limit and actual lower limit, upper limit and actual upper limit, class size or class width, class mark or class mid-interval, variables, continuous variables, discrete variables.

Graphical
Representation

There are various methods of graphical representation of statistical data. In our study, we learn two types: the histogram, and the ogive or cumulative frequency curve.

Cumulative
Frequency

Cumulative
frequency is obtained by adding the frequency of a class interval and the
frequencies of the preceding intervals up to that class interval.
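As a minimal illustration of this definition, the running totals can be computed with Python's `itertools.accumulate`. The class frequencies are hypothetical.

```python
from itertools import accumulate

# Hypothetical frequencies of five consecutive class intervals
frequencies = [2, 5, 8, 4, 1]

# Each entry is the frequency of a class plus those of all preceding classes.
cumulative = list(accumulate(frequencies))
print(cumulative)  # [2, 7, 15, 19, 20]
```

The last cumulative frequency always equals the total number of observations.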

Cumulative
Frequency Curve

A cumulative frequency curve is a plot of the cumulative frequency against the upper class boundary, with the points joined by line segments. Any continuous cumulative frequency curve, including a cumulative frequency polygon, is called an ogive. There are two ways of constructing an ogive or cumulative frequency curve: as a 'less than' curve or as a 'more than' curve. The curve is usually S-shaped.

Points on the cumulative frequency curve have as abscissas the actual upper limits (for a 'less than' curve) or the actual lower limits (for a 'more than' curve), and as ordinates the cumulative frequencies.

GRAPHICAL REPRESENTATION OF DATA

Graphical representation of the available data is a very important step of statistical analysis. We will first discuss the organization of data. The word 'data' is the plural of 'datum', which means a fact. In statistics the term is used for numerical facts such as measures of height, weight and scores on achievement and intelligence tests.

Tests, experiments and surveys in education and psychology provide us with valuable data, mostly in the shape of numerical scores. To understand the available data and derive meaning and useful conclusions from them, the data have to be organized or arranged in some systematic way. This can be done in the following ways:

1. Statistical tables

2. Rank order

3. Frequency distribution

Statistical
tables

The data are tabulated, that is, arranged into rows and columns under different headings. Such tables can list the original raw scores as well as percentages, means, standard deviations and so on.

Rules
for constructing tables:

1. Title of the table should be simple, concise and unambiguous. As a
rule, it should appear on the table.

2. The table should be suitably divided into columns and rows according
to the nature of data and purpose. These columns and rows should be arranged in
a logical order to facilitate comparison.

3. The heading of each column or row should be as brief as possible. Two or more columns or rows with similar headings may be grouped under a common heading to avoid repetition, and we may have subheadings or captions.

4. Subtotals for each separate classification and a general total for all combined classes are to be given. These totals should be given at the bottom or to the right of the items concerned.

5. The units in which the data are given must invariably be mentioned.

6. Necessary footnotes, providing essential explanation of points in the tabulated data that are open to ambiguous interpretation, must be given at the bottom of the table.

7. The sources from which the data have been received should be given at the end of the table.

9. If the numbers tabulated have more than three significant figures, the digits should be grouped in threes, e.g. 4394756 as 4 394 756.

10. For all purposes and by all means, the table should be as simple as
possible so that it may be studied by the readers with minimum possible strain
and create a clear picture and interpretations of the data.

Rank
order

The original raw scores can be arranged in an ascending or a descending series, exhibiting an order with respect to the rank or merit position of the individual. Example:

Sixteen students of a BA final psychology class obtained the following scores on an achievement test:

5 8 4 12 15 17 18 12 20 7 8 19 6 9 10 11

Tabulating the given data in rank order:

| S. No. | Score | S. No. | Score | S. No. | Score | S. No. | Score |
|---|---|---|---|---|---|---|---|
| 1 | 20 | 5 | 15 | 9 | 10 | 13 | 7 |
| 2 | 19 | 6 | 12 | 10 | 9 | 14 | 6 |
| 3 | 18 | 7 | 12 | 11 | 8 | 15 | 5 |
| 4 | 17 | 8 | 11 | 12 | 8 | 16 | 4 |
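The same rank ordering can be reproduced with a short Python sketch that sorts the sixteen scores from the example in descending order and numbers them.

```python
# Raw scores of the sixteen students, as given in the example
scores = [5, 8, 4, 12, 15, 17, 18, 12, 20, 7, 8, 19, 6, 9, 10, 11]

# Rank order: highest score first, serial numbers starting at 1
ranked = sorted(scores, reverse=True)
for serial, score in enumerate(ranked, start=1):
    print(f"{serial:2d}  {score}")
```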

Frequency
Distribution

Organizing the data according to rank order does not help us to summarize a series of raw scores. It also does not tell us the frequency of the raw scores. In a frequency distribution we group the data into arbitrarily chosen groups or classes, and we count how many times a particular score or group of scores occurs in the given data. This is known as the frequency distribution of numerical data.

Construction
of Frequency distribution table

Finding the
range:

First of all the range of the series to be grouped is found. It is found by subtracting the lowest score from the highest. In the present problem the range of the distribution is 46 - 12 = 34.

Determining
class interval:

After finding the range, we find the class interval, represented by Y. The formula for this is:

Writing the
contents of the frequency distribution table:

Writing the classes of the distribution.

In the first column we write the classes of the distribution. First of all the lowest class is settled, and afterwards the subsequent classes are written down. In this case we take 10-14 as the lowest class; then we have the higher classes 15-19, 20-24, and so on up to 45-49.

Tallying the scores into proper classes.

The given scores are tallied into the proper classes in the second column; then the tallies are counted against each class to obtain the frequency of the class.
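The tallying procedure can be sketched in Python. The raw scores of the problem are not reproduced in the text, so the scores below are hypothetical, chosen only so that the lowest is 12 and the highest is 46; the function name `frequency_table` is likewise an assumption.

```python
def frequency_table(scores, lowest=10, width=5):
    """Tally integer scores into classes lowest..lowest+width-1, and so on upward."""
    highest = max(scores)
    table = []
    lower = lowest
    while lower <= highest:
        upper = lower + width - 1
        # Count the scores that fall inside the written class limits.
        freq = sum(1 for s in scores if lower <= s <= upper)
        table.append((f"{lower}-{upper}", freq))
        lower += width
    return table


# Hypothetical scores with lowest score 12 and highest score 46
scores = [12, 15, 18, 22, 23, 27, 27, 31, 33, 34, 38, 41, 46]
print("range:", max(scores) - min(scores))
table = frequency_table(scores)
for label, freq in table:
    print(f"{label}: {'|' * freq} ({freq})")
```

With these limits the classes run from 10-14 up to 45-49, matching the classes named in the text.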

GRAPHICAL
REPRESENTATION OF DATA

The statistical data may be presented in a more attractive form, appealing to the eye, with the help of graphic aids, i.e. pictures and graphs. Such presentation carries a lot of communicative power. A mere glimpse of the pictures and graphs may enable the viewer to have an immediate and meaningful grasp of a large amount of data.

Ungrouped
data may be represented through a bar diagram, pie diagram, pictograph and line
graph.

Bar graph represents the data on the graph paper in the form of vertical
or horizontal bars.

In a pie diagram, the data are represented by a circle of 360 degrees divided into parts, each part representing a share of the data converted into an angle. The total frequency is equated to 360 degrees, and then the angles corresponding to the component parts are calculated.

In pictograms, the data is represented by means of picture figures
appropriately designed in proportion to the numerical data.

Line graphs represent the data concerning one variable on the horizontal
and other variable on the vertical axis of the graph paper.

Grouped data may be represented graphically by histogram, frequency
polygon, cumulative frequency graph and cumulative frequency percentage curve
or ogive.

A histogram is essentially a bar graph of a frequency distribution. The actual class limits plotted on the x-axis represent the widths of the bars, and the respective frequencies of these class intervals represent the heights of the bars.

A frequency
polygon is a line graph for the graphical representation of frequency
distribution.

A cumulative
frequency graph represents the cumulative frequency distribution by plotting
actual upper limits of the class intervals on the x axis and the respective
cumulative frequencies of these class intervals on the y axis.

Cumulative
frequency percentage curve or ogive represents
cumulative percentage frequency distribution by plotting upper limits of the
class intervals on the x axis and the respective cumulative percentage
frequencies of these class intervals on the y axis.

METHOD FOR CONSTRUCTING A HISTOGRAM

1. The scores are taken in the form of actual class limits, such as 19.5-24.5, 24.5-29.5 and so on, rather than written class limits such as 20-24, 25-29.

2. It is customary to take two extra class intervals, one below and one above the grouped intervals.

3. Now we take the actual lower limits of all the class intervals and try
to plot them on the x axis. The lower limit of the lowest class interval is
taken at the intersecting point of x axis and y axis.

4. Frequencies of the distribution are plotted on the y axis.

5. Each class interval with its specific frequency is represented by
separate rectangle. The base of each rectangle is the width of the class
interval. And the height is representative of the frequency of that class or
interval.

6. Care should be taken to select appropriate units of representation along the x and y axes. Neither the x axis nor the y axis should be too short or too long.
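The core of these steps, turning written class limits into the rectangles of a histogram, can be sketched in Python. The class intervals and frequencies are hypothetical, and the function name `histogram_bars` is an assumption; the sketch presumes integer scores, so the actual limits lie half a unit beyond the written limits.

```python
def histogram_bars(classes, freqs):
    """Return (left, right, height) for each rectangle of a histogram.

    classes: written class limits of integer scores, e.g. (20, 24)
    freqs:   frequency of each class, in the same order
    """
    bars = []
    for (lower, upper), freq in zip(classes, freqs):
        # Actual class limits: 20-24 becomes 19.5-24.5, so adjacent bars touch.
        bars.append((lower - 0.5, upper + 0.5, freq))
    return bars


# Hypothetical grouped data
bars = histogram_bars([(20, 24), (25, 29), (30, 34)], [3, 7, 4])
for left, right, height in bars:
    print(f"{left}-{right}: height {height}")
```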

For quantitative data categories, the two most common types of graphics are *histograms* and *line charts*. Figure 5.9a shows a histogram for the binned exam data of Table 5.3. Figure 5.9b shows a line chart for the same data.

A **histogram** is essentially a bar graph in which the data categories are quantitative. Thus, the bars on a histogram must follow the natural order of the numerical categories. In addition, the widths of histogram bars have a specific meaning. For example, the width of each bar in Figure 5.9a represents 5 points on the exam. Because there are no gaps between the categories, the bars on a histogram touch each other.

A **line chart** serves the same basic purpose as a histogram, but instead of using bars, a
line chart connects a series of dots. When data are binned, the dot is placed
at the center of each bin. Histograms and line charts are often used to show
how some variable changes with time. For example, the line chart in Figure 5.10
shows how the U.S. homicide rate has changed with time. The categories are time
intervals. In this case, each bin represents a year in the data. Histograms and
line charts with time on the horizontal axis are often called **time-series
diagrams.**

METHOD FOR CONSTRUCTING A FREQUENCY POLYGON

1. As in a histogram, two extra class intervals are taken, one above and one below the given class intervals.

2. The mid-points of the class intervals are calculated.

3. The mid-points are plotted along the x axis and the corresponding frequencies along the y axis.

4. The points obtained are joined by lines to give the frequency polygon.
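The four steps can be sketched in Python. The class intervals and frequencies are hypothetical, the function name `polygon_points` is an assumption, and equal class widths are presumed.

```python
def polygon_points(classes, freqs):
    """Return the (mid-point, frequency) vertices of a frequency polygon.

    One extra class with frequency 0 is added below and above the given
    classes so that the polygon closes on the x axis.
    """
    width = classes[1][0] - classes[0][0]  # assumes equal class widths
    first_lower, first_upper = classes[0]
    last_lower, last_upper = classes[-1]
    extended = (
        [(first_lower - width, first_upper - width)]
        + list(classes)
        + [(last_lower + width, last_upper + width)]
    )
    extended_freqs = [0] + list(freqs) + [0]
    return [((lo + hi) / 2, f) for (lo, hi), f in zip(extended, extended_freqs)]


# Hypothetical grouped data
vertices = polygon_points([(20, 24), (25, 29), (30, 34)], [3, 7, 4])
print(vertices)
```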

DIFFERENCE BETWEEN HISTOGRAM AND FREQUENCY POLYGON

Histogram is a bar graph while frequency polygon is a line graph.
Frequency polygon is more useful and practical. In frequency polygon it is easy
to know the trends of the distribution; we are unable to do so in histogram.
Histogram gives a very clear and accurate picture of the relative proportion of
the frequency from interval to interval.

METHOD FOR CONSTRUCTING A CUMULATIVE FREQUENCY GRAPH

1. First of all we calculate the actual upper and lower limits of the class intervals, i.e. if the class interval is 20-24, then the upper limit is 24.5 and the lower limit is 19.5.

2. We must now select a suitable scale according to the range of the class intervals, and plot the actual upper limits on the x axis and the respective cumulative frequencies on the y axis.

3. All the plotted points are then joined by successive straight lines, resulting in a line graph.

4. To start the graph on the x axis, an extra class interval with cumulative frequency zero is taken.
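These steps can be sketched in Python. The class intervals and frequencies are hypothetical, the function name `ogive_points` is an assumption, and integer scores are presumed so that the actual limits lie half a unit beyond the written ones.

```python
from itertools import accumulate


def ogive_points(classes, freqs):
    """Return (actual upper limit, cumulative frequency) points for a 'less than' graph."""
    # Starting point: the actual upper limit of the extra class below the
    # first one, which has cumulative frequency zero.
    points = [(classes[0][0] - 0.5, 0)]
    for (lower, upper), cum in zip(classes, accumulate(freqs)):
        points.append((upper + 0.5, cum))
    return points


# Hypothetical grouped data
curve = ogive_points([(20, 24), (25, 29), (30, 34)], [3, 7, 4])
print(curve)
```

Joining the returned points with straight lines gives the cumulative frequency graph.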

Statistics is that branch of
mathematics devoted to the collection, compilation, display, and interpretation
of numerical data. In general, the field can be divided into two major subgroups,
descriptive statistics and inferential statistics. The former subject deals
primarily with the accumulation and presentation of numerical data, while the
latter focuses on predictions that can be made based on those data.

Perhaps the
simplest way to report the results of the study described above is to make a
table. The advantage of constructing a table of data is that a reader can get a
general idea about the findings of the study in a brief glance.

Two fundamental concepts used in statistical analysis are population and
sample. The term population refers to a complete set of individuals, objects,
or events that belong to some category. For example, all of the players who are
employed by Major League Baseball teams make up the population of professional
major league baseball players. The term sample refers to some subset of a
population.

Statistics
- Graphical Representation

The table
shown above is one way of representing the frequency distribution of a sample
or population. A frequency distribution is any method for summarizing data that
shows the number of individuals or individual cases present in each given
interval of measurement. In the table above, there are 5,382,025 female
African-Americans in the age group 0-19;

Statistics
- Distribution Curves

Finally, think
of a histogram in which the vertical bars are very narrow...and then very, very
narrow. As one connects the midpoints of these bars, the frequency polygon
begins to look like a smooth curve, perhaps like a high, smoothly shaped hill.
A curve of this kind is known as a distribution curve. Probably the most
familiar kind of distribution curve is one with a peak in the middle.

Statistics - Other Kinds Of Frequency Distributions

Bar graphs look very much like histograms except that gaps are left
between adjacent bars. This difference is based on the fact that bar graphs are
usually used to represent discrete data and the space between bars is a
reminder of the discrete character of the data represented. Line graphs can
also be used to represent continuous data. If one were to record the
temperature once an hour all day to week.

Statistics
- Measures Of Central Tendency

Both
statisticians and non-statisticians talk about "averages" all the
time. But the term average can have a number of different meanings. In the
field of statistics, therefore, workers prefer to use the term "measure of
central tendency" for the concept of an "average." One way to understand how various measures of central tendency.

Measures
Of Variability

Suppose that a teacher gave the same test to two different classes and obtained the following results:

Class 1: 80%, 80%, 80%, 80%, 80%

Class 2: 60%, 70%, 80%, 90%, 100%

If you calculate the mean for both sets of scores, you get the same answer: 80%. But the collection of scores from which this mean was obtained was very different in the two cases. The way that statisticians have of distinguishing…
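The two sets of scores can be checked with a short Python sketch. The passage above is cut off before it names a particular measure of variability, so the range (highest score minus lowest score) is used here purely as an assumed, simple illustration of spread.

```python
from statistics import mean

class_1 = [80, 80, 80, 80, 80]
class_2 = [60, 70, 80, 90, 100]

for name, scores in (("Class 1", class_1), ("Class 2", class_2)):
    spread = max(scores) - min(scores)  # range: an assumed, simple measure of spread
    print(f"{name}: mean = {mean(scores)}, range = {spread}")
```

Both classes have the same mean, but only the second shows any spread around it.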

Statistics
- Inferential Statistics

Expressing a collection of data in some useful form, as described above,
is often only the first step in a statistician's work. The next step will be to
decide what conclusions, predictions, and other statements, if any, can be made
based on those data. A number of sophisticated mathematical techniques have now
been developed to make these judgments. An important fundamental concept used
in biostatistics.

Computer forensics is the preservation,
analysis, and interpretation of computer data. There is a need for software
that aids investigators in locating data on hard drives left by persons
committing illegal activities. These software tools should reduce the tedious
efforts of forensic examiners, especially when searching large hard drives. A
method is proposed here that uses visualization techniques to represent file
statistics, such as file size, last access date, creation date, last
modification date, owner, and file type. The user interface to this software
allows file searching, pattern matching, and display of file contents. By
viewing file information graphically, the developed software will reduce the
examiner’s analysis time and greatly increase the probability of locating
criminal evidence.

Computer forensics is the
preservation, analysis, and interpretation of computer data. In a world wherein
the number of crimes committed using computers is increasing rapidly, a
definite need exists for forensic software tools. These tools allow
investigators to follow digital tracks left by persons committing illegal
activities. Traces of evidence may be found in plain text documents, log files,
or even system files, yet more technologically advanced criminals may conceal
information by deleting it, encrypting it, or embedding it inside another file.
With the large amount of storage space available on modern hard drives,
searching for a single file becomes quite tedious without the help of special
forensic tools. Using visualization techniques to display information about
computer data can help forensic specialists direct their search to suspicious
files.

A great deal of time is wasted trying
to interpret large amounts of data that are neither correlated nor meaningful,
a task that demands high levels of patience and tolerance for error. The
well-known phrase “a picture is worth a thousand words” captures what we are
trying to accomplish here. Human brains can interpret and comprehend pictures,
video, and charts much faster than a written description of the same material.
This is because the human mind is able to examine graphics in parallel but can
examine text only serially. Imagine a friend trying to describe in an email the beauty of the
Shenandoah Valley using the best vocabulary he has. It takes some time because
there are so many elements to convey without misrepresenting the Shenandoah
Valley as another valley full of green trees. Eventually, the friend decides it
is best just to show a picture taken from a scenic overlook. You are amazed at
the beauty and realize it would have taken thousands of words to describe it
all.

One single picture not only presented
an accurate representation of the Shenandoah Valley but also saved you from
reading a very long email. Using this concept of visual perception, we have
developed a graphical user interface (GUI) that displays file information
visually. The user is able to query a specific directory and see statistics,
such as file size, access date, creation date, modification date, owner, and
file type, represented by pixel intensity or colour, wherein
each pixel represents a file.
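
The source does not show how file attributes are turned into pixels, so the following Python sketch is only one plausible reading: it gathers per-file statistics with the standard library and rescales one attribute linearly to gray levels 0 to 255, one value per file. The function names `collect_file_stats` and `to_intensity` and the sample sizes are hypothetical.

```python
import os


def collect_file_stats(directory):
    """Walk `directory` and return (path, size, access time, modification time) per file."""
    records = []
    for root, _dirs, files in os.walk(directory):
        for name in files:
            path = os.path.join(root, name)
            try:
                st = os.stat(path)
            except OSError:
                continue  # skip files that cannot be read
            records.append((path, st.st_size, st.st_atime, st.st_mtime))
    return records


def to_intensity(values):
    """Linearly rescale numbers to integer gray levels 0..255, one per file."""
    if not values:
        return []
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0 for _ in values]  # no contrast to show
    return [round(255 * (v - lo) / (hi - lo)) for v in values]


# Hypothetical file sizes in bytes; the largest file becomes the brightest pixel.
sample_sizes = [0, 1, 5]
sample_pixels = to_intensity(sample_sizes)
print(sample_pixels)
```

The same rescaling could be applied to access or modification times instead of sizes; a real tool would also need a rule for arranging the pixels on screen, which is outside this sketch.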

Requests for more information about a
suspect file can be filled by clicking on the display and walking through
various menus. Viewing information about multiple files or understanding the
relationship between them is also helpful. The user interface to this software
allows file searching, pattern matching, and display of file contents. Each of
these options allows a deeper analysis of the data stored on the hard drive and
results in a flexible and customizable tool for locating criminal evidence.

The software tool we have developed
will greatly aid the computer forensic process by reducing the time to identify
suspicious files and increasing the probability of locating criminal evidence.
This is done by using a graphical representation of the file rather than
traditional text.

Our contributions to computer science
include the use of enhanced tree-maps, applied visualization techniques for
computer forensics, and a software framework on which to build future
enhancements. Enhanced tree-maps help represent temporal information about files,
such as access time. Traditional tree-maps only have the capability of
representing spatial information, such as size. We are the first to apply
visualization techniques to computer forensics, and we will show this to be a
promising method for identifying hidden or altered files. Lastly, our software
framework can accommodate additional visualization techniques not yet developed.

Documentation

During the analysis process, detailed
information must be recorded if there is to be any hope of a successful court
appearance. This information includes forensic tools used, actions taken, and
chain of custody. Some forensic tools have more credibility in court than
others because they have been proven. Thus, it is important to use a proven
forensic tool. Actions taken include opening files and hashing. Time of day
should be recorded whenever a file is opened, hashed, or scanned, along with
the directory it was discovered in. Every examiner involved in the case needs
to be recorded in the chain of custody. At any time in the investigation, it
should be clear and possible to identify the individual who carried out an
analysis task.
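
The record-keeping described above can be sketched as a small append-only log. This is a minimal Python illustration, not the format of any particular forensic tool: the `CustodyEntry` fields, the `record_action` helper, and the example examiner and tool names are all assumptions made for the example.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class CustodyEntry:
    """One documented action: who did what, with which tool, to which file, and when."""
    timestamp: str
    examiner: str
    tool: str
    action: str
    path: str
    sha256: str


def record_action(log, examiner, tool, action, path, data):
    """Hash `data`, timestamp the action in UTC, and append the entry to `log`."""
    entry = CustodyEntry(
        timestamp=datetime.now(timezone.utc).isoformat(),
        examiner=examiner,
        tool=tool,
        action=action,
        path=path,
        sha256=hashlib.sha256(data).hexdigest(),
    )
    log.append(entry)
    return entry


# Hypothetical usage: the examiner, tool name, and path are placeholders.
custody_log = []
first_entry = record_action(
    custody_log, "examiner_a", "hypothetical-tool 1.0", "hashed",
    "/evidence/notes.txt", b"abc",
)
print(first_entry)
```

Making the entries immutable (`frozen=True`) reflects the idea that a documented action should not be edited afterwards; how the log itself is protected and stored is left open here.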

Court Appearance

Once the evidence has been analyzed,
authenticated, and documented, it may go to court. It is important to present
the case in a simple and clear manner because judges and juries may not have
technical knowledge of computer systems. Investigators who have followed the
forensic process will have a higher probability of winning the case. However,
if there are holes in the chain of custody or any step of the forensic process,
the defence will exploit them and usually succeed at
convincing the jury the investigation was handled improperly.

The prosecution, thus, would not be
able to rebuild their case and would lose. An
understanding of the computer forensic process leads to the development of
improved software that aids investigators in locating evidence. Any software
used to collect or analyze evidence must follow the computer forensic
guidelines; otherwise, its use becomes a hindrance rather than a benefit.

Visualization of Data

Tree-Maps

In our method, we use visualization
techniques to help represent file attributes. One method of displaying the
relationship of files visually in two dimensions is called a tree-map. Shneiderman describes tree-maps as a 2D
space-filling algorithm for complex tree structures. They are designed to
display the entire tree structure in one screen. Each file is represented by a
shaded box that adheres to a chosen colouring scheme
that highlights file and directory boundaries. Box size is determined by two
parameters: the size of the user selected display region and percentage of the
selected directory the file occupies. Other file directory representations like
that of Windows Explorer use nodes and edges rotated on their side and always
require scrolling up and down to view the complex structure.
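
To make the sizing rule concrete, here is a minimal Python sketch of a slice-and-dice layout, the classic tree-map scheme, in which each box receives a share of its parent's rectangle equal to its share of the parent's total size. The source does not say which layout its software uses, so this is an assumption; the nested-dictionary tree format and the names `total_size` and `slice_and_dice` are illustrative only.

```python
def total_size(node):
    """Size of a file (a number) or the summed size of a directory (a dict)."""
    if isinstance(node, dict):
        return sum(total_size(child) for child in node.values())
    return node


def slice_and_dice(node, x, y, w, h, depth=0, path=""):
    """Return one (path, x, y, width, height) box per file inside the region."""
    if not isinstance(node, dict):
        return [(path, x, y, w, h)]
    total = total_size(node)
    if total == 0:
        return []  # empty directory: nothing to draw
    boxes = []
    offset = 0.0  # fraction of the parent already used
    for name, child in node.items():
        share = total_size(child) / total
        child_path = f"{path}/{name}"
        if depth % 2 == 0:
            # Even depth: cut the region into vertical strips, left to right.
            boxes += slice_and_dice(child, x + offset * w, y, share * w, h,
                                    depth + 1, child_path)
        else:
            # Odd depth: cut the region into horizontal strips, top to bottom.
            boxes += slice_and_dice(child, x, y + offset * h, w, share * h,
                                    depth + 1, child_path)
        offset += share
    return boxes


# Hypothetical directory tree: sizes are in arbitrary units.
tree = {"docs": {"a.txt": 10, "b.txt": 30}, "big.iso": 60}
boxes = slice_and_dice(tree, 0, 0, 100, 50)
for box in boxes:
    print(box)
```

In this example the 60-unit file receives 60% of the 100 by 50 display area, which illustrates both why large files are easy to spot and why very small files can shrink to boxes too small to see.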

The tree-map facilitates easy
recognition of the largest files because they take up the most space in the 2D display. The method of using tree-maps to visualize data
storage and directory structure greatly reduces the time it takes to locate
large files in a tree structure that is nine levels deep and contains many
thousands of files. Tree-maps are primarily designed to emphasize large files.
However, Shneiderman does point out that a user can
drag a mouse over the display and click on a shaded box to query the system for
the file name or other information. Such additions may enhance the usefulness
of tree-maps, but stand-alone tree-maps for computer forensics contain many
weaknesses. Small files and directories are hidden among larger files and may
not even show up on the display. We may be looking for a simple file on a
massive hard drive. If the file is small or if the disk contains numerous
files, our file will hardly stand out.

For our purposes, stand-alone
tree-maps require enhancement that provides the user with advanced filtering
and display techniques. In this way, tree-maps are interesting and provide
groundwork for opportunities in computer forensics.

Graphic representation of statistics: Questions & Answers

Question: Name the different graphical representations of data used in
statistics.

Answer: Graphical representation of statistical data is for the sole purpose
of easier interpretation. In modern manufacturing it has developed into
'statistical process control', which gave rise to the 'seven QC tools'; these
were later supplemented by a new set of seven.

The seven QC tools: flow charts, run charts, Pareto diagram, histogram,
cause-and-effect diagrams, scatter diagrams, and control chart (the most
famous and widely used).

The new version: affinity diagrams, relations diagrams, tree diagram, matrix
diagram, arrow diagram, process decision program charts, and matrix data
analysis.

Type any of these keywords into a search engine and you will find better
explanations of them. Good luck.

Question: Based on your observation,
list 4 points on the characteristics of logarithmic or exponential
functions and their graphical representation.

Answer: They are mirror images of
each other. That’s one.

Question: Working with Numbers Number
Operations and Number Sense Simple Algebra Algebra,
Functions, and Patterns Geometry and Graphing Measurement, Geometry and
Coordinate Geometry or lead me to a site. 4. Statistical Math Data analysis,
reading graphical representations of data Statistics and probability

Answer: I'm not trying to just get
points, but no one can help you with this. You have to
have real problems, because these subjects are so broad that it would be
impossible to cover them even briefly without talking for an hour.

The results of examinations, after statistical processing, can be given as
graphic representations in which numerical values are presented as drawings.
Graphs give a general characterization of a phenomenon, show its general
regularities, and make it possible to analyze the research data more deeply.

They facilitate the comparison of parameters, give an idea of the structure
and character of the connection between phenomena, and indicate their tendencies.

Graphic demonstration is therefore often combined with graphic analysis, in
which the graphic representation serves not only as a means of demonstrating
the results and conclusions of research, but also as a means of analyzing the
material obtained and revealing internal connections and regularities.

When graphs are constructed, the following are taken into account: the
character of the data to be represented, the intended use of the graph
(demonstration at a conference or lecture, reproduction in a scientific work,
etc.), its purpose (to show the results obtained clearly, or only to emphasize
a particular regularity or fact), and the level of the audience to which the
graph is shown.

All of this determines the choice of the type of graphic representation, the
color, the number of graphs, the size of the lettering, etc. In all cases
graphs should be clear, convenient and easy to read.

In medical statistical research, linear diagrams, plane diagrams, cartograms,
and radial (linear-circular) diagrams are used.

LINEAR DIAGRAMS are graphs on which numerical values are displayed as curves,
which make it possible to trace the dynamics of a phenomenon over time or to
establish the dependence of one attribute on another.

On linear diagrams with two or more curves it is also possible to compare two
or more dynamic series, and to establish how changes in one series depend on
fluctuations in another.

Linear diagrams are built in a system of rectangular coordinates, in which the
horizontal scale is laid off from left to right along the axis of abscissas
(X), and the vertical scale from bottom to top along the axis of ordinates (Y).
An obligatory requirement in constructing any graph is scale: the image in the
drawing must be reduced in proportion to the figures it represents.

In contrast to linear diagrams, which describe the dynamics of a process,
plane diagrams are used when it is necessary to represent statistical
phenomena or facts that are independent of one another.

The simplest example of a plane diagram is one made up of rectangles or other
figures. Numerical values on plane diagrams are represented by geometrical
figures such as rectangles and squares. These diagrams are used for
demonstrating and popularizing data, and also when it is necessary to represent
the structure of a phenomenon at a single moment of observation.

Examples are the age distribution of those who fell ill, or the structure of
disease in a settlement.

**Fig.** Age structure of the population (the share of each age group in the
total population).

In column (long-pillar) diagrams, numerical values are represented by
rectangular columns with identical bases and different heights.

The height of a rectangle corresponds to the relative value of the phenomenon
being studied. To construct a column diagram, a scale is used from which the
height of each column can be determined.

Column diagrams serve for comparing several quantities. The rectangles that
represent the quantities can also be placed horizontally rather than
vertically, which gives a tape diagram (Fig. 4). In some cases representing
quantities as tapes (strips) is more convenient than as columns, because it is
easier to label each tape with a horizontal inscription.

With column and tape diagrams it is possible not only to compare different
quantities, but also to display the structure of these quantities at the same
time and to compare their parts. For example, column or tape diagrams that show
the distribution of diseases by the main nosological forms can also show the
percentage of cases among men and among women.

For this purpose each rectangle (column or tape) is divided into two parts, one
corresponding to the number of cases among men and the other to the number
among women.

Circular diagrams are used to display the ratio of homogeneous absolute
quantities.

They use the area of a circle rather than the area of a rectangle.

It must be remembered, however, that the areas of circles are related to one
another as the squares of their radii. Therefore, when constructing circular
diagrams, we must extract the square root of each quantity to be displayed and
construct the radius on that basis; once the radius is known, it is easy to
draw the circle.
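
The square-root rule can be checked with a short Python sketch. The function name `circle_radii`, the `max_radius` parameter, and the two sample quantities are assumptions made for illustration; the only thing taken from the text is that area, not radius, must be proportional to the quantity.

```python
import math


def circle_radii(values, max_radius=1.0):
    """Radii whose circle areas are proportional to `values`.

    The largest value is drawn with radius `max_radius`; every other radius is
    scaled by the square root of its ratio to the largest value.
    """
    largest = max(values)
    return [max_radius * math.sqrt(v / largest) for v in values]


# Hypothetical absolute quantities: the second is four times the first.
radii = circle_radii([100, 400], max_radius=10)
print(radii)

# The radius only doubles, while the area ratio is the required factor of four.
area_ratio = (math.pi * radii[1] ** 2) / (math.pi * radii[0] ** 2)
print(area_ratio)
```

Scaling the radius directly by the quantity would instead make the second circle sixteen times larger in area, which is the distortion the rule is meant to prevent.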

If the circular diagram displays parts of a whole, the circles should not be
drawn separately from one another but superimposed on one another. The whole
and its parts can also be represented as a circle divided into sectors: the
sector diagram. In constructing a sector diagram, the whole area of the circle
is taken as 100%, and each sector occupies the part of the area that
corresponds to its percentage.
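
Since the whole circle stands for 100%, each percentage point corresponds to 3.6° of arc. A minimal Python sketch of that conversion follows; the function name `sector_angles` and the sample structure are hypothetical.

```python
def sector_angles(parts):
    """Map each part of a whole to the central angle (in degrees) of its sector."""
    total = sum(parts.values())
    return {name: 360.0 * value / total for name, value in parts.items()}


# Hypothetical structure of a phenomenon, in percent of the whole.
angles = sector_angles({"cause A": 50, "cause B": 30, "cause C": 20})
print(angles)
```

Because the function divides by the total, the input does not have to be in percent already; absolute counts give the same angles.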

In practice, sector diagrams can be built not only on the area of a circle but
also on the area of a square or a rectangle.

However, such figures are often harder to divide than a circle, and
consequently they are rather seldom used as the basis of sector diagrams.

Radial, or linear-circular, diagrams are constructed on the basis of polar
coordinates, in which the radius replaces the vertical scale of diagrams based
on a system of rectangular coordinates.

An example of the radial diagram is a wind rose, which is used on maps to
represent changes in wind direction during a calendar period (month, year).

Radial diagrams are used to illustrate seasonal fluctuations of quantities such
as morbidity or mortality rates.

These diagrams are constructed on a circle from whose center 12 radii are
drawn. Each radius cuts off an arc of 30° (360/12 = 30) from the circle and
represents the ordinate of one calendar month: January, February, March, etc.

The center of the circle is taken as the initial zero point. Then, along the
radii and according to the scale chosen beforehand, the values are laid off
that express the intensity of the seasonal fluctuations of the phenomenon in
each calendar month.

Connecting the marked points gives a closed line that makes the seasonal
fluctuations visible.

When building radial diagrams, it is necessary to remember the rule that the
radii are counted from the top of the diagram, clockwise.
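
The construction can be sketched in Python as a conversion from monthly values to points on 12 radii spaced 30° apart. The sketch assumes that January lies on the radius pointing straight up and that the months follow clockwise; the function name `radial_points`, the `scale` parameter, and the sample values are hypothetical.

```python
import math


def radial_points(monthly_values, scale=1.0):
    """Return the closed outline of a radial diagram as (x, y) points.

    Month i is laid off on the radius at i * 30 degrees, measured clockwise
    from the top of the circle; the center of the circle is the zero point.
    """
    if len(monthly_values) != 12:
        raise ValueError("expected one value per calendar month")
    step = 360 / 12  # 30 degrees between neighbouring radii
    points = []
    for i, value in enumerate(monthly_values):
        angle = math.radians(i * step)
        r = value * scale
        # Clockwise from the top: x grows with sin, y with cos.
        points.append((r * math.sin(angle), r * math.cos(angle)))
    points.append(points[0])  # repeat the first point to close the line
    return points


# Hypothetical seasonal values with a winter peak.
outline = radial_points([9, 8, 6, 4, 3, 2, 2, 3, 4, 6, 8, 9])
print(outline[0], outline[3])
```

The returned list can be passed to any plotting routine that draws a polyline; bulges in the outline then show the months in which the phenomenon is most intense.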

**Fig.** The radial diagram.

To compare different phenomena by territory, cartograms are built where
necessary. These are geographical maps on which graphic symbols show the
intensity of distribution and the grouping of a phenomenon (morbidity,
mortality, etc.) for a given period of time (Fig. 2.4).

They are therefore best built on simplified maps on which only administrative
boundaries and some large settlements are shown. In constructing a cartogram,
the grouping of the phenomena displayed is of great importance.

The simplest grouping divides the parameters into a group with values below
average and a group with values above average. Under this division, regions or
districts with above-average values are shaded on the cartogram, and those with
below-average values are left unshaded.
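
The two-group rule is simple enough to state as code. In this minimal Python sketch the function name `shade_by_average`, the district names, and the rates are hypothetical, and a district exactly at the average is left unshaded, which is an assumption the text does not settle.

```python
def shade_by_average(rates):
    """Map each region to True (shade it) if its rate is above the average."""
    mean = sum(rates.values()) / len(rates)
    return {region: rate > mean for region, rate in rates.items()}


# Hypothetical morbidity rates per 1,000 population for four districts.
district_rates = {"District A": 10, "District B": 20, "District C": 30, "District D": 40}
shading = shade_by_average(district_rates)
print(shading)
```

A finer cartogram would replace the single cut at the average with several class boundaries and a matching series of shading densities.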

**Graphics in the Media**

Now that we’ve
discussed basic types of statistical graphs, we are ready to explore some of
the fancier graphics that appear daily in the news. We will also discuss
several cautions to keep in mind when interpreting media graphics.

Graphics Beyond the Basics

Many graphical
displays of data go beyond the basic types. Here, we explore a few of the types
that are most common in the news media.

**Multiple Bar Graphs**

A **multiple bar graph** is
a simple extension of a regular bar graph. It has two or more sets of bars that
allow comparison between two or more data sets. All the data sets must involve
the same categories so that they can be displayed on the same graph. For
example, Figure 5.15 is a multiple bar graph showing trends in home computing.
The categories are years. The two sets of bars represent two different measures
of home computing: ownership of personal computers and connection to the
Internet. Note that a legend clearly identifies the two sets of bars.

**EXAMPLE 1** *Computing Trends*

Summarize two major
trends shown in Figure 5.15.

**SOLUTION** The most
obvious trend is that both data sets show an increase with time. That is, the
number of homes with computers and the number of online homes both increased
with time. We see a second trend by comparing the bars within each year. In 1995, the
number of online homes (about 10 million) was less than one-third the number of homes with computers (about 33 million). By 2003,
the number of online homes (about 62 million) was about 90% of the number of
homes with computers (about 70 million). This tells us that a higher percentage
of computer users are going online.
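
The comparison within each year is a ratio of the two bar heights. The following Python sketch redoes that arithmetic with the approximate values quoted in the solution (in millions of homes); the variable and function names are illustrative.

```python
# Approximate bar heights read from the solution text, in millions of homes.
homes_with_computers = {1995: 33, 2003: 70}
online_homes = {1995: 10, 2003: 62}


def online_share(year):
    """Fraction of computer-owning homes that were online in the given year."""
    return online_homes[year] / homes_with_computers[year]


for year in (1995, 2003):
    print(year, f"{online_share(year):.0%}")
```

The share rises from roughly 30% (just under one-third) to roughly 89%, which the solution rounds to about 90%.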

**Stack Plots**

Another common type of
graph, called a **stack plot**, shows different data sets in a vertical
stack. Figure 5.16 uses a stack plot to show trends in death rates (deaths per
100,000 people) for four diseases since 1900. Each disease has its own
color-coded region, or wedge; note the importance of the legend. The *thickness*
of a wedge at a particular time tells you its value at that time: When a
wedge is thick it has a large value, and when it is thin it has a small value.

**EXAMPLE 2** *Stack Plot*

Based on Figure 5.16,
what was the death rate for cardiovascular disease in 1980? Discuss the general
trends visible on this graph.

**SOLUTION** For 1980, the
cardiovascular wedge extends from about 180 to 620 on the vertical axis, so its
thickness is about 440. Thus, the death rate in 1980 for cardiovascular disease
was about 440 deaths per 100,000 people. The graph shows several important
trends. First, the downward slope of the top wedge shows that the overall death
rate from these four diseases decreased substantially, from nearly 800 deaths
per 100,000 in 1900 to about 525 in 2003. The drastic decline in the thickness
of the tuberculosis wedge shows that this disease was once a major killer, but
has been nearly wiped out since 1950. Meanwhile, the cancer wedge shows that the death
rate from cancer rose steadily until the mid-1990s,
but has dropped somewhat since then.
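
Reading a wedge in a stack plot means subtracting its lower boundary from its upper boundary. The Python sketch below shows both directions: stacking values into bands, and recovering a value from a band. Only the 1980 readings of 180 and 620 come from the solution; the function names and the grouping of everything beneath the cardiovascular wedge into one layer are assumptions for illustration.

```python
def stack(layers):
    """Stack layers bottom to top.

    `layers` is a list of (name, values) pairs with one value per time point.
    Returns a dict mapping each name to a list of (bottom, top) boundaries.
    """
    n = len(layers[0][1])
    running = [0] * n  # current top of the stack at each time point
    bands = {}
    for name, values in layers:
        tops = [base + v for base, v in zip(running, values)]
        bands[name] = list(zip(running, tops))
        running = tops
    return bands


def thickness(band):
    """Value represented by a wedge: upper boundary minus lower boundary."""
    bottom, top = band
    return top - bottom


# 1980 only: the wedges below cardiovascular disease are lumped together.
bands_1980 = stack([("lower wedges (combined)", [180]), ("cardiovascular", [440])])
cardio_band = bands_1980["cardiovascular"][0]
print(cardio_band, thickness(cardio_band))
```

The band for cardiovascular disease runs from 180 to 620, so its thickness of 440 matches the death rate read off in the solution.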

**Graphs of Geographical Data**

We are often interested in
geographical patterns in data. Figure 5.17 shows one common way of displaying
geographical data. In this case, the data on per capita (per person) income are
shown state by state. The legend explains that different colors represent
different income levels. Similar colors are used for similar income levels.
Thus, it is easy to see that income levels tend to be highest in the northeast
and lowest in the south.

The display in Figure 5.17 works well
because each state is associated with a unique income level. For data that vary
continuously across geographical areas, a **contour map** is more
convenient. Figure 5.18 shows a contour map of temperature over the United
States at a particular time. Each of the *contours* connects locations
with the same temperature. For example, the temperature is 50°F
everywhere along the contour labeled 50°F and 60°F
everywhere along the contour labeled 60°F. Between
these two contours, the temperature is between 50°F
and 60°F. Note that in regions where contours are
tightly spaced, there are greater temperature changes. For example, the closely
packed contours in the northeast indicate that the temperature varies
substantially over small distances. To make the graph easier to read, the
regions between adjacent contours are color-coded.

**Three-Dimensional Graphics**

Today, computer software
makes it easy to give almost any graph a three-dimensional appearance. For
example, Figure 5.19 shows the bar graph of Figure 5.3, but “dressed up” with a
three-dimensional look. It may look nice, but the three-dimensional effects are
purely cosmetic. They don’t provide any information that wasn’t already in the
two-dimensional graph in Figure 5.3. As this example shows, many “three-dimensional”
graphics really only make two-dimensional data look a little fancier. In
contrast, each of the three axes in Figure 5.20 carries distinct information,
making it a true three-dimensional graph. Researchers studying migration
patterns of a bird species (the *Bobolink*) counted the number of birds
flying over seven New York cities throughout the night. As shown on the inset
map, the cities were aligned east-west so that the
researchers would learn what parts of the state the birds flew over, and at what
times of night, as they headed south for the winter. Thus, the three axes
measure *number of birds*, *time of night*, and *east-west location*.
