From ba90edc72a3eedc95f2a2bd2c6923b860cc3e4d8 Mon Sep 17 00:00:00 2001 From: Olivier Hallot Date: Wed, 20 Aug 2014 21:36:38 -0300 Subject: Fix fdo#80338 Help pages for Data Statistics MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit This is a text to get a mimnimum help for Calc Data Statistics. It can be improved with better english and better maths. Change-Id: I3771ed89c560df1e4e23835f5a15d8cfc47282cd Reviewed-on: https://gerrit.libreoffice.org/11051 Reviewed-by: Caolán McNamara Tested-by: Caolán McNamara --- source/text/scalc/01/stat_data.xhp | 386 ++++++++ source/text/scalc/01/statistics.xhp | 1729 ++++++++++++++++++++++++++++++----- 2 files changed, 1878 insertions(+), 237 deletions(-) create mode 100644 source/text/scalc/01/stat_data.xhp (limited to 'source') diff --git a/source/text/scalc/01/stat_data.xhp b/source/text/scalc/01/stat_data.xhp new file mode 100644 index 0000000000..5263a70ea7 --- /dev/null +++ b/source/text/scalc/01/stat_data.xhp @@ -0,0 +1,386 @@ + + + + +
+ Data + Input Range: The reference of the range of the data to analyze. + Results to: The reference of the top left cell of the range where the results will be displayed. +
+
+ Grouped By + Select whether the input data is laid out in columns or in rows. +
+
+ Example + The following data will be used as an example: + + + + + + A + + + B + + + C + + + + + 1 + + + Maths + + + Physics + + + Biology + + + + + 2 + + + 47 + + + 67 + + + 33 + + + + + 3 + + + 36 + + + 68 + + + 42 + + + + + 4 + + + 40 + + + 65 + + + 44 + + + + + 5 + + + 39 + + + 64 + + + 60 + + + + + 6 + + + + + 38 + + + 43 + + + + + 7 + + + 47 + + + 84 + + + 62 + + + + + 8 + + + 29 + + + 80 + + + 51 + + + + + 9 + + + 27 + + + 49 + + + 40 + + + + + 10 + + + 57 + + + 49 + + + 12 + + + + + 11 + + + 56 + + + 33 + + + 60 + + + + + 12 + + + 57 + + + + + + + + + 13 + + + 26 + + + + + + +
+
+
+ Example + The following table has two time series, one representing an impulse function at time t=0 and the other an impulse function at time t=2. + + + + + + A + + + B + + + + + 1 + + + 1 + + + 0 + + + + + 2 + + + 0 + + + 0 + + + + + 3 + + + 0 + + + 1 + + + + + 4 + + + 0 + + + 0 + + + + + 5 + + + 0 + + + 0 + + + + + 6 + + + 0 + + + 0 + + + + + 7 + + + 0 + + + 0 + + + + + 8 + + + 0 + + + 0 + + + + + 9 + + + 0 + + + 0 + + + + + 10 + + + 0 + + + 0 + + + + + 11 + + + 0 + + + 0 + + + + + 12 + + + 0 + + + 0 + + + + + 13 + + + 0 + + + 0 + + +
+
+ +
diff --git a/source/text/scalc/01/statistics.xhp b/source/text/scalc/01/statistics.xhp index bb46695637..a1acd510ec 100644 --- a/source/text/scalc/01/statistics.xhp +++ b/source/text/scalc/01/statistics.xhp @@ -1,6 +1,4 @@ - - - - - -Data Statistics -/text/scalc/01/statistics.xhp - - - -Data Statistics -
- - Analysis toolpack;sampling - sampling;Analysis toolpack - Data statistics;sampling - - -Sampling - - (to-do) -
- - Menu Data - Statistics - Sampling... -
- - (To do) - - -Input Range -Select the input data range. - -Results to -Selects the upper left cell where the samples will be displayed. - -
- -
- - Analysis toolpack;descriptive statistics - descriptive statistics;Analysis toolpack - Data statistics;descriptive statistics - - -Descriptive Statistics - - (to-do) -
- - Menu Data - Statistics - Descriptive Statistics... -
- - (To do) - - -Input Range -Select the input data range. - -Results to -Selects the upper left cell where the samples will be displayed. - -
- -
- - Analysis toolpack;analysis of variance - analysis of variance;Analysis toolpack - Data statistics;analysis of variance - - -Analysis of Variance (ANOVA) - - (to-do) -
- - Menu Data - Statistics - Analysis of Variance (ANOVA)... -
- - (To do) - - -Input Range -Select the input data range. - -Results to -Selects the upper left cell where the samples will be displayed. - -
- -
- - Analysis toolpack;correlation - correlation;Analysis toolpack - Data statistics;correlation - - -Correlation - - Calculates the correlation of two sets of numeric data. -
- - Menu Data - Statistics - Correlation... -
-The correlation coefficient (a value between -1 and +1) means how strongly two variables are related to each other. You can use the CORREL function or the Data Statistics to find the correlation coefficient between two variables. -A correlation coefficient of +1 indicates a perfect positive correlation. -A correlation coefficient of -1 indicates a perfect negative correlation - -Input Range -Select the inout data range. - -Results to -Selects the upper left cell where the correlation will be displayed. - -Columns -The data series ae in columns. - -Rows -The data series are in rows. -
- -
- - Analysis toolpack;covariance - covariance;Analysis toolpack - Data statistics;covariance - - -Covariance - - Calculates the covariance of two sets of numeric data. -
- - Menu Data - Statistics - Covariance... -
- - Covariance indicates how two variables are related. A positive covariance means the variables are positively related, - while a negative covariance means the variables are inversely related. - - -Input Range -Select the input data range. - -Results to -Selects the upper left cell where the correlation will be displayed. - -Columns -The data series are in columns. - -Rows -The data series are in rows. -
- -
- - Analysis toolpack;exponential smoothing - exponential smoothing;Analysis toolpack - Data statistics;exponential smoothing - - -Exponential Smoothing - - (to-do) -
- - Menu Data - Statistics - Exponential Smoothing... -
- - (To do) - - -Input Range -Select the input data range. - -Results to -Selects the upper left cell where the samples will be displayed. - -
- -
- - Analysis toolpack;moving average - moving average;Analysis toolpack - Data statistics;moving average - - -Moving Average - - (to-do) -
- - Menu Data - Statistics - Moving Average... -
- - (To do) - - -Input Range -Select the input data range. - -Results to -Selects the upper left cell where the samples will be displayed. - -
- -
- - Analysis toolpack;t-test - t-test;Analysis toolpack - Data statistics;t-test - Analysis toolpack;F-test - F-test;Analysis toolpack - Data statistics;F-test - - -t-test and F-test - - (to-do) -
- - Menu Data - Statistics - t-test... - - Menu Data - Statistics - F-test... -
- - (To do) - - -Input Range -Select the input data range. - -Results to -Selects the upper left cell where the samples will be displayed. - -
- - - + + + Data Statistics in Calc + /text/scalc/01/statistics.xhp + + + + Data Statistics in Calc + Use the data statistics in Calc to perform complex data analysis + To work on a complex statistical or engineering analysis, + you can save steps and time by using Calc Data Statistics. You provide the data and parameters for each analysis, and the set of tools + uses the appropriate statistical or engineering functions to calculate and display the results in an output table. +
+ + Analysis toolpack;sampling + sampling;Analysis toolpack + Data statistics;sampling + + + Sampling + + Create a table with data sampled from another table. + +
+ + Menu Data - Statistics - Sampling... + +
+ Sampling allows you to pick data from a source table to fill a target table. The sampling can be random or on a periodic basis. + Sampling is done row-wise: each sampled row of the source table is copied whole into a row of the target table. + + Sampling Method + Random: Picks exactly Sample Size rows of the source table at random. + Sample size: Number of rows sampled from the source table. + Periodic: Picks rows at an interval defined by Period. + Period: The sampling interval, in rows. For example, a period of 2 picks every second row. + Example + The following data will be used as an example of a source data table for sampling: + + + + + + A + + + B + + + C + + + + + 1 + + + 11 + + + 21 + + + 31 + + + + + 2 + + + 12 + + + 22 + + + 32 + + + + + 3 + + + 13 + + + 23 + + + 33 + + + + + 4 + + + 14 + + + 24 + + + 34 + + + + + 5 + + + 15 + + + 25 + + + 35 + + + + + 6 + + + 16 + + + 26 + + + 36 + + + + + 7 + + + 17 + + + 27 + + + 37 + + + + + 8 + + + 18 + + + 28 + + + 38 + + + + + 9 + + + 19 + + + 29 + + + 39 + + +
+ Sampling with a period of 2 will result in the following table: + + + + 12 + + + 22 + + + 32 + + + + + 14 + + + 24 + + + 34 + + + + + 16 + + + 26 + + + 36 + + + + + 18 + + + 28 + + + 38 + + +
+
+
+ + Analysis toolpack;descriptive statistics + descriptive statistics;Analysis toolpack + Data statistics;descriptive statistics + + + Descriptive Statistics + + Fill a table in the spreadsheet with the main statistical properties of the data set. + +
+ + Menu Data - Statistics - Descriptive Statistics... + +
+ The Descriptive Statistics analysis tool generates a report of univariate statistics for data in the input range, providing information about the central tendency and variability of your data. + For more information, please visit the Wikipedia: http://en.wikipedia.org/wiki/Descriptive_statistics + + + + The following table displays the results of the descriptive statistics of the sample data above. + + + + + + Column 1 + + + Column 2 + + + Column 3 + + + + + Mean + + + 41.9090909091 + + + 59.7 + + + 44.7 + + + + + Standard Error + + + 3.5610380138 + + + 5.3583786934 + + + 4.7680650629 + + + + + Mode + + + 47 + + + 49 + + + 60 + + + + + Median + + + 40 + + + 64.5 + + + 43.5 + + + + + Variance + + + 139.4909090909 + + + 287.1222222222 + + + 227.3444444444 + + + + + Standard Deviation + + + 11.8106269559 + + + 16.944681237 + + + 15.0779456308 + + + + + Kurtosis + + + -1.4621677981 + + + -0.9415988746 + + + 1.418052719 + + + + + Skewness + + + 0.0152409533 + + + -0.2226426904 + + + -0.9766803373 + + + + + Range + + + 31 + + + 51 + + + 50 + + + + + Minimum + + + 26 + + + 33 + + + 12 + + + + + Maximum + + + 57 + + + 84 + + + 62 + + + + + Sum + + + 461 + + + 597 + + + 447 + + + + + Count + + + 11 + + + 10 + + + 10 + + +
+
+
+ + Analysis toolpack;analysis of variance + analysis of variance;Analysis toolpack + Data statistics;analysis of variance + + + Analysis of Variance (ANOVA) + + Produces the analysis of variance (ANOVA) of a given data set. + +
+ + Menu Data - Statistics - Analysis of Variance (ANOVA)... + +
+ Analysis of variance (ANOVA) is a collection of statistical models used to analyze the differences between group means and their associated procedures (such as "variation" among and between groups). In the ANOVA setting, the observed variance in a particular variable is partitioned into components attributable to different sources of variation. In its simplest form, ANOVA provides a statistical test of whether or not the means of several groups are equal, and therefore generalizes the t-test to more than two groups. As doing multiple two-sample t-tests would result in an increased chance of committing a statistical type I error, ANOVAs are useful in comparing (testing) three or more means (groups or variables) for statistical significance. + For more information, please visit the Wikipedia: http://en.wikipedia.org/wiki/ANOVA + + + Type + Select if the analysis is for a single factor or for two factor ANOVA. + Parameters + Alpha: the level of significance of the test. + Rows per sample: Define how many rows a sample has. + + The following table displays the results of the analysis of variance (ANOVA) of the sample data above. + + + + ANOVA - Single Factor + + + + + + + + + + + + + + + Alpha + + + 0.05 + + + + + + + + + + + + + + + + + + + + + + + + + + + Groups + + + Count + + + Sum + + + Mean + + + Variance + + + + + + + Column 1 + + + 11 + + + 461 + + + 41.9090909091 + + + 139.4909090909 + + + + + + + Column 2 + + + 10 + + + 597 + + + 59.7 + + + 287.1222222222 + + + + + + + Column 3 + + + 10 + + + 447 + + + 44.7 + + + 227.3444444444 + + + + + + + + + + + + + + + + + + + + + Source of Variation + + + SS + + + df + + + MS + + + F + + + P-value + + + + + Between Groups + + + 1876.5683284457 + + + 2 + + + 938.2841642229 + + + 4.3604117704 + + + 0.0224614952 + + + + + Within Groups + + + 6025.1090909091 + + + 28 + + + 215.1824675325 + + + + + + + + + Total + + + 7901.6774193548 + + + 30 + + + + + + + + +
+
+
+ + Analysis toolpack;correlation + correlation;Analysis toolpack + Data statistics;correlation + + + Correlation + + Calculates the correlation of two sets of numeric data. + +
+ + Menu Data - Statistics - Correlation... + +
+ The correlation coefficient (a value between -1 and +1) indicates how strongly two variables are related to each other. You can use the CORREL function or the Data Statistics to find the correlation coefficient between two variables. + A correlation coefficient of +1 indicates a perfect positive correlation. + A correlation coefficient of -1 indicates a perfect negative correlation. + For more information on statistical correlation, refer to http://en.wikipedia.org/wiki/Correlation + + + + The following table displays the results of the correlation of the sample data above. + + + + Correlations + + + Column 1 + + + Column 2 + + + Column 3 + + + + + Column 1 + + + 1 + + + + + + + + + Column 2 + + + -0.4029254917 + + + 1 + + + + + + + Column 3 + + + -0.2107642836 + + + 0.2309714048 + + + 1 + + +
+
+
+ + Analysis toolpack;covariance + covariance;Analysis toolpack + Data statistics;covariance + + + Covariance + + Calculates the covariance of two sets of numeric data. + +
+ + Menu Data - Statistics - Covariance... + +
+ In probability theory and statistics, covariance is a measure of how much two random variables change together. If the greater values of one variable mainly correspond with the greater values of the other variable, and the same holds for the smaller values, i.e., the variables tend to show similar behavior, the covariance is positive. In the opposite case, when the greater values of one variable mainly correspond to the smaller values of the other, i.e., the variables tend to show opposite behavior, the covariance is negative. The sign of the covariance therefore shows the tendency in the linear relationship between the variables. The magnitude of the covariance is not easy to interpret. The normalized version of the covariance, the correlation coefficient, however, shows by its magnitude the strength of the linear relation. + For more information on statistical covariance, refer to http://en.wikipedia.org/wiki/Covariance + + + + The following table displays the results of the covariance of the sample data above. + + + + Covariances + + + Column 1 + + + Column 2 + + + Column 3 + + + + + Column 1 + + + 126.8099173554 + + + + + + + + + Column 2 + + + -61.4444444444 + + + 258.41 + + + + + + + Column 3 + + + -32 + + + 53.11 + + + 204.61 + + +
+
+
+ + Analysis toolpack;exponential smoothing + exponential smoothing;Analysis toolpack + Data statistics;exponential smoothing + + + Exponential Smoothing + + Produces a smoothed data series. + +
+ + Menu Data - Statistics - Exponential Smoothing... + +
+ Exponential smoothing is a technique that can be applied to time series data, either to produce smoothed data for presentation, or to make forecasts. The time series data themselves are a sequence of observations. The observed phenomenon may be an essentially random process, or it may be an orderly, but noisy, process. Whereas in the simple moving average the past observations are weighted equally, exponential smoothing assigns exponentially decreasing weights over time. + Exponential smoothing is commonly applied to financial market and economic data, but it can be used with any discrete set of repeated measurements. The simplest form of exponential smoothing should be used only for data without any systematic trend or seasonal components. + For more information on exponential smoothing, refer to http://en.wikipedia.org/wiki/Exponential_smoothing + + + Parameters + Smoothing Factor: A parameter between 0 and 1 that represents the damping factor Alpha in the smoothing equation. + + The resulting smoothing is below with smoothing factor as 0.5: + + + + Alpha + + + + + + + 0.5 + + + + + + + Column 1 + + + Column 2 + + + + + 1 + + + 0 + + + + + 1 + + + 0 + + + + + 0.5 + + + 0 + + + + + 0.25 + + + 0.5 + + + + + 0.125 + + + 0.25 + + + + + 0.0625 + + + 0.125 + + + + + 0.03125 + + + 0.0625 + + + + + 0.015625 + + + 0.03125 + + + + + 0.0078125 + + + 0.015625 + + + + + 0.00390625 + + + 0.0078125 + + + + + 0.001953125 + + + 0.00390625 + + + + + 0.0009765625 + + + 0.001953125 + + + + + 0.0004882813 + + + 0.0009765625 + + + + + 0.0002441406 + + + 0.0004882813 + + +
+
+
+ + Analysis toolpack;moving average + moving average;Analysis toolpack + Data statistics;moving average + + + Moving Average + + Calculates the moving average of a time series. + +
+ + Menu Data - Statistics - Moving Average... + +
+ In statistics, a moving average (rolling average or running average) is a calculation to analyze data points by creating a series of averages of different subsets of the full data set. It is also called a moving mean or rolling mean and is a type of finite impulse response filter. + You can get more details about moving average in the Wikipedia: http://en.wikipedia.org/wiki/Moving_average + + + Parameters + Interval: The number of samples used in the moving average calculation. + + + + + Column 1 + + + Column 2 + + + + + #N/A + + + #N/A + + + + + 0.3333333333 + + + 0.3333333333 + + + + + 0 + + + 0.3333333333 + + + + + 0 + + + 0.3333333333 + + + + + 0 + + + 0 + + + + + 0 + + + 0 + + + + + 0 + + + 0 + + + + + 0 + + + 0 + + + + + 0 + + + 0 + + + + + 0 + + + 0 + + + + + 0 + + + 0 + + + + + 0 + + + 0 + + + + + #N/A + + + #N/A + + +
+
+
+ + Analysis toolpack;t-test + t-test;Analysis toolpack + Data statistics;t-test + Analysis toolpack;F-test + F-test;Analysis toolpack + Data statistics;F-test + + + t-test and F-test + + Calculates the t-Test or the F-Test of two data samples. + +
+ + Menu Data - Statistics - t-test... + + + Menu Data - Statistics - F-test... + +
+ A t-test is any statistical hypothesis test in which the test statistic follows a Student's t distribution if the null hypothesis is supported. It can be used to determine if two sets of data are significantly different from each other, and is most commonly applied when the test statistic would follow a normal distribution if the value of a scaling term in the test statistic were known. When the scaling term is unknown and is replaced by an estimate based on the data, the test statistic (under certain conditions) follows a Student's t distribution. + An F-test is any statistical test in which the test statistic has an F-distribution under the null hypothesis. It is most often used when comparing statistical models that have been fitted to a data set, in order to identify the model that best fits the population from which the data were sampled. Exact "F-tests" mainly arise when the models have been fitted to the data using least squares. + For more information on t-tests, see Wikipedia: http://en.wikipedia.org/wiki/T-test + For more information on F-tests, see Wikipedia: http://en.wikipedia.org/wiki/F-test + Data + Variable 1 range: The reference of the range of the first data series to analyze. + Variable 2 range: The reference of the range of the second data series to analyze. + Results to: The reference of the top left cell of the range where the test results will be displayed.
+ + + The following table shows the t-Test for the data series above: + + + + t-test + + + + + + + + + Alpha + + + 0.05 + + + + + + + + + Variable 1 + + + Variable 2 + + + + + Mean + + + 0.0769230769 + + + 0.0769230769 + + + + + Variance + + + 0.0769230769 + + + 0.0769230769 + + + + + Observations + + + 13 + + + 13 + + + + + Pearson Correlation + + + -0.0833333333 + + + + + + + Hypothesized Mean Difference + + + 2 + + + + + + + Observed Mean Difference + + + 0 + + + + + + + Variance of the Differences + + + 0.1666666667 + + + + + + + df + + + 12 + + + + + + + t Stat + + + -17.6635217327 + + + + + + + P (T<=t) one-tail + + + 2.9587510767438E-010 + + + + + + + t Critical one-tail + + + 1.7822875556 + + + + + + + P (T<=t) two-tail + + + 5.91750215348761E-010 + + + + + + + t Critical two-tail + + + 2.1788128297 + + + + +
+ Example for F-Test: + The following table shows the F-Test for the data series above: + + + + F-test + + + + + + + + + Alpha + + + 0.05 + + + + + + + + + Variable 1 + + + Variable 2 + + + + + Mean + + + 0.0769230769 + + + 0.0769230769 + + + + + Variance + + + 0.0769230769 + + + 0.0769230769 + + + + + Observations + + + 13 + + + 13 + + + + + df + + + 12 + + + 12 + + + + + F + + + 1 + + + + + + + P (F<=f) right-tail + + + 0.5 + + + + + + + F Critical right-tail + + + 2.6866371125 + + + + + + + P (F<=f) left-tail + + + 0.5 + + + + + + + F Critical left-tail + + + 0.3722125312 + + + + + + + P two-tail + + + 1 + + + + + + + F Critical two-tail + + + 0.3051313549 + + + 3.277277094 + + +
+
+
-- cgit