Overview

Dataset info

Number of variables: 12
Number of observations: 891
Missing cells: 866 (8.1%)
Duplicate rows: 0 (0.0%)
Total size in memory: 83.7 KiB
Average record size in memory: 96.1 B
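
The missing-cell percentage can be reproduced from figures that appear elsewhere in this report. The sketch below is a minimal check, assuming the percentage is the number of missing cells divided by the total number of cells (observations times variables). The variable names are illustrative; the per-column counts are the ones listed under Warnings and in the Embarked section.

```python
# Figures copied from this report.
n_vars = 12
n_obs = 891
missing_by_column = {"Age": 177, "Cabin": 687, "Embarked": 2}

# Assumption: percentage = missing cells / (observations * variables).
total_cells = n_vars * n_obs
missing_cells = sum(missing_by_column.values())
missing_pct = 100 * missing_cells / total_cells

print(missing_cells, f"{missing_pct:.1f}%")  # prints: 866 8.1%
```

Only three columns contribute missing cells, and Cabin alone accounts for most of them.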

Variables types

Numeric: 5
Categorical: 5
Boolean: 1
Date: 0
URL: 0
Text (Unique): 1
Rejected: 0
Unsupported: 0

Warnings

Age has 177 (19.9%) missing values [Missing]
Cabin has a high cardinality: 148 distinct values [Warning]
Cabin has 687 (77.1%) missing values [Missing]
Fare has 15 (1.7%) zeros [Zeros]
Parch has 678 (76.1%) zeros [Zeros]
SibSp has 608 (68.2%) zeros [Zeros]
Ticket has a high cardinality: 681 distinct values [Warning]

Variables

Age
Numeric

Distinct count: 89
Unique (%): 10.0%
Missing (%): 19.9%
Missing (n): 177
Infinite (%): 0.0%
Infinite (n): 0
Mean: 29.69911765
Minimum: 0.42
Maximum: 80
Zeros (%): 0.0%

Quantile statistics

Minimum: 0.42
5-th percentile: 4
Q1: 20.125
Median: 28
Q3: 38
95-th percentile: 56
Maximum: 80
Range: 79.58
Interquartile range: 17.875

Descriptive statistics

Standard deviation: 14.52649733
Coef of variation: 0.4891221855
Kurtosis: 0.1782741536
Mean: 29.69911765
MAD: 11.32294447
Skewness: 0.3891077823
Sum: 21205.17
Variance: 211.0191247
Memory size: 7.1 KiB
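
Several of the Age statistics above are simple functions of the others, which gives a quick consistency check. The sketch below recomputes them from the reported values. It assumes the mean and sum are taken over the non-missing observations only; the variable names are illustrative and not part of the report.

```python
# Values copied from the Age section of this report.
n_obs = 891
n_missing = 177
age_min, age_max = 0.42, 80.0
q1, q3 = 20.125, 38.0
mean, std = 29.69911765, 14.52649733

# Assumption: mean and sum are computed over non-missing values only.
n_valid = n_obs - n_missing       # non-missing ages
value_range = age_max - age_min   # reported Range: 79.58
iqr = q3 - q1                     # reported Interquartile range: 17.875
coef_variation = std / mean       # reported Coef of variation: 0.4891221855
variance = std ** 2               # reported Variance: 211.0191247
total = mean * n_valid            # reported Sum: 21205.17

print(n_valid, round(value_range, 2), iqr)
print(round(coef_variation, 6), round(variance, 3), round(total, 2))
```

The recomputed values agree with the reported ones to the precision shown, which supports reading the mean and sum as being taken over the 714 non-missing ages.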
[Figure: histogram with fixed-size bins (bins=50)]

Value              Count  Frequency (%)
24                 30     3.4%
22                 27     3.0%
18                 26     2.9%
28                 25     2.8%
19                 25     2.8%
30                 25     2.8%
21                 24     2.7%
25                 23     2.6%
36                 22     2.5%
29                 20     2.2%
Other values (78)  467    52.4%
(Missing)          177    19.9%
 

Minimum 5 values

Value  Count  Frequency (%)
0.42   1      0.1%
0.67   1      0.1%
0.75   2      0.2%
0.83   2      0.2%
0.92   1      0.1%

Maximum 5 values

Value  Count  Frequency (%)
80     1      0.1%
74     1      0.1%
71     2      0.2%
70.5   1      0.1%
70     2      0.2%
 

Cabin
Categorical

Distinct count: 148
Unique (%): 16.6%
Missing (%): 77.1%
Missing (n): 687

Value               Count  Frequency (%)
B96 B98             4      0.4%
G6                  4      0.4%
C23 C25 C27         4      0.4%
C22 C26             3      0.3%
F33                 3      0.3%
E101                3      0.3%
D                   3      0.3%
F2                  3      0.3%
E44                 2      0.2%
F G73               2      0.2%
Other values (137)  173    19.4%
(Missing)           687    77.1%
 
Max length: 15
Mean length: 3.134680135
Min length: 1
Contains chars: True
Contains digits: True
Contains spaces: True
Contains non-words: True

Embarked
Categorical

Distinct count: 4
Unique (%): 0.4%
Missing (%): 0.2%
Missing (n): 2

Value      Count  Frequency (%)
S          644    72.3%
C          168    18.9%
Q          77     8.6%
(Missing)  2      0.2%
 
Max length: 3
Mean length: 1.004489338
Min length: 1
Contains chars: True
Contains digits: False
Contains spaces: False
Contains non-words: False
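
The string-length figures for Embarked look odd for a column of single-letter port codes: the maximum length is 3 and the mean length is slightly above 1. One explanation that fits the numbers is that the two missing values were measured as the three-character string "nan". This is an inference from the arithmetic, not something the report states, and the sketch below only checks that the numbers are consistent with it.

```python
# Counts copied from the Embarked frequency table in this report.
counts = {"S": 644, "C": 168, "Q": 77}
n_missing = 2

# Assumption (not stated in the report): each missing value is
# measured as the string "nan", which has length 3.
missing_as_text = "nan"

n_total = sum(counts.values()) + n_missing
total_chars = (
    sum(len(value) * count for value, count in counts.items())
    + len(missing_as_text) * n_missing
)
mean_length = total_chars / n_total
max_length = max(max(len(value) for value in counts), len(missing_as_text))

print(n_total, total_chars, max_length, round(mean_length, 9))
# prints: 891 895 3 1.004489338
```

If the assumption holds, the same effect would also inflate the distinct count to 4 (S, C, Q, and the missing marker) and would affect the length statistics of any other categorical column with missing values, such as Cabin.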

Fare
Numeric

Distinct count: 248
Unique (%): 27.8%
Missing (%): 0.0%
Missing (n): 0
Infinite (%): 0.0%
Infinite (n): 0
Mean: 32.20420797
Minimum: 0
Maximum: 512.3292
Zeros (%): 1.7%

Quantile statistics

Minimum: 0
5-th percentile: 7.225
Q1: 7.9104
Median: 14.4542
Q3: 31
95-th percentile: 112.07915
Maximum: 512.3292
Range: 512.3292
Interquartile range: 23.0896

Descriptive statistics

Standard deviation: 49.6934286
Coef of variation: 1.543072528
Kurtosis: 33.39814088
Mean: 32.20420797
MAD: 28.16369185
Skewness: 4.78731652
Sum: 28693.9493
Variance: 2469.436846
Memory size: 7.1 KiB
[Figure: histogram with fixed-size bins (bins=50)]
[Figure: histogram with variable-size bins (bins=[0, 2.00625, 6.3375, 7.0479, 7.0521, ..., 57.4896, 92.2896, 159.1646, 262.6875, 512.3292], "bayesian blocks" binning strategy)]

Value               Count  Frequency (%)
8.05                43     4.8%
13                  42     4.7%
7.8958              38     4.3%
7.75                34     3.8%
26                  31     3.5%
10.5                24     2.7%
7.925               18     2.0%
7.775               16     1.8%
26.55               15     1.7%
0                   15     1.7%
Other values (238)  615    69.0%
 

Minimum 5 values

Value     Count  Frequency (%)
0         15     1.7%
4.0125    1      0.1%
5         1      0.1%
6.2375    1      0.1%
6.4375    1      0.1%

Maximum 5 values

Value     Count  Frequency (%)
512.3292  3      0.3%
263       4      0.4%
262.375   2      0.2%
247.5208  2      0.2%
227.525   4      0.4%
 

Name
Categorical, Unique

First 5 values

Value                                  Count  Frequency (%)
Abbing, Mr. Anthony                    1      0.1%
Abbott, Mr. Rossmore Edward            1      0.1%
Abbott, Mrs. Stanton (Rosa Hunt)       1      0.1%
Abelson, Mr. Samuel                    1      0.1%
Abelson, Mrs. Samuel (Hannah Wizosky)  1      0.1%

Last 5 values

Value                                  Count  Frequency (%)
van Melkebeke, Mr. Philemon            1      0.1%
van Billiard, Mr. Austin Blyler        1      0.1%
del Carlo, Mr. Sebastiano              1      0.1%
de Pelsmaeker, Mr. Alfons              1      0.1%
de Mulder, Mr. Theodore                1      0.1%
 

Parch
Numeric

Distinct count: 7
Unique (%): 0.8%
Missing (%): 0.0%
Missing (n): 0
Infinite (%): 0.0%
Infinite (n): 0
Mean: 0.3815937149
Minimum: 0
Maximum: 6
Zeros (%): 76.1%

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 2
Maximum: 6
Range: 6
Interquartile range: 0

Descriptive statistics

Standard deviation: 0.8060572211
Coef of variation: 2.112344071
Kurtosis: 9.778125179
Mean: 0.3815937149
MAD: 0.58074195
Skewness: 2.749117047
Sum: 340
Variance: 0.6497282437
Memory size: 7.1 KiB
[Figure: histogram with fixed-size bins (bins=7)]
[Figure: histogram with variable-size bins (bins=[0, 0.5, 1.5, 2.5, 6], "bayesian blocks" binning strategy)]

Value  Count  Frequency (%)
0      678    76.1%
1      118    13.2%
2      80     9.0%
5      5      0.6%
3      5      0.6%
4      4      0.4%
6      1      0.1%
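
Because Parch takes only seven distinct values, the frequency table above is enough to recover the summary statistics reported for it. The sketch below does so. It assumes the reported variance is the sample variance (divisor n - 1), which is the convention that matches the reported figure; the names are illustrative.

```python
# Frequency table copied from the Parch section: value -> count.
freq = {0: 678, 1: 118, 2: 80, 3: 5, 4: 4, 5: 5, 6: 1}

n = sum(freq.values())                                   # observations
total = sum(value * count for value, count in freq.items())  # reported Sum: 340
mean = total / n                                         # reported Mean: 0.3815937149
zeros_pct = 100 * freq[0] / n                            # reported Zeros (%): 76.1%

# Assumption: sample variance, i.e. squared deviations divided by n - 1.
variance = sum(
    count * (value - mean) ** 2 for value, count in freq.items()
) / (n - 1)                                              # reported Variance: 0.6497282437

print(n, total, round(mean, 10), round(zeros_pct, 1), round(variance, 6))
```

The same approach works for SibSp, which also has seven distinct values and a complete frequency table.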
 

Minimum 5 values

Value  Count  Frequency (%)
0      678    76.1%
1      118    13.2%
2      80     9.0%
3      5      0.6%
4      4      0.4%

Maximum 5 values

Value  Count  Frequency (%)
6      1      0.1%
5      5      0.6%
4      4      0.4%
3      5      0.6%
2      80     9.0%
 

PassengerId
Numeric

Distinct count: 891
Unique (%): 100.0%
Missing (%): 0.0%
Missing (n): 0
Infinite (%): 0.0%
Infinite (n): 0
Mean: 446
Minimum: 1
Maximum: 891
Zeros (%): 0.0%

Quantile statistics

Minimum: 1
5-th percentile: 45.5
Q1: 223.5
Median: 446
Q3: 668.5
95-th percentile: 846.5
Maximum: 891
Range: 890
Interquartile range: 445

Descriptive statistics

Standard deviation: 257.353842
Coef of variation: 0.5770265516
Kurtosis: -1.2
Mean: 446
MAD: 222.7497194
Skewness: 0
Sum: 397386
Variance: 66231
Memory size: 7.1 KiB
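
PassengerId runs over the integers 1 to 891 with no gaps (891 distinct values, minimum 1, maximum 891), so its statistics follow from closed forms for a sequence 1..n. The sketch below recomputes them directly. It also suggests how to read the MAD row: the reported value matches the mean absolute deviation around the mean, not the median absolute deviation. That reading is an inference from the numbers, and the variable names are illustrative.

```python
import math

n = 891
ids = range(1, n + 1)

total = sum(ids)                                         # reported Sum: 397386
mean = total / n                                         # reported Mean: 446
variance = sum((i - mean) ** 2 for i in ids) / (n - 1)   # sample variance
std = math.sqrt(variance)                                # reported: 257.353842
mean_abs_dev = sum(abs(i - mean) for i in ids) / n       # reported MAD: 222.7497194

# Closed forms for the integers 1..n.
assert total == n * (n + 1) // 2
assert mean == (n + 1) / 2
assert variance == n * (n + 1) / 12                      # reported Variance: 66231

print(total, mean, variance, round(std, 6), round(mean_abs_dev, 7))
```

A column like this is an identifier and carries no distributional information of its own, which is why its skewness is 0 and its histogram is flat.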
[Figure: histogram with fixed-size bins (bins=50)]
[Figure: histogram with variable-size bins (bins=[1, 891], "bayesian blocks" binning strategy)]

Value               Count  Frequency (%)
891                 1      0.1%
293                 1      0.1%
304                 1      0.1%
303                 1      0.1%
302                 1      0.1%
301                 1      0.1%
300                 1      0.1%
299                 1      0.1%
298                 1      0.1%
297                 1      0.1%
Other values (881)  881    98.9%
 

Minimum 5 values

Value  Count  Frequency (%)
1      1      0.1%
2      1      0.1%
3      1      0.1%
4      1      0.1%
5      1      0.1%

Maximum 5 values

Value  Count  Frequency (%)
891    1      0.1%
890    1      0.1%
889    1      0.1%
888    1      0.1%
887    1      0.1%
 

Pclass
Categorical

Distinct count: 3
Unique (%): 0.3%
Missing (%): 0.0%
Missing (n): 0

Value  Count  Frequency (%)
3      491    55.1%
1      216    24.2%
2      184    20.7%
 
Max length: 1
Mean length: 1
Min length: 1
Contains chars: False
Contains digits: True
Contains spaces: False
Contains non-words: False

Sex
Categorical

Distinct count: 2
Unique (%): 0.2%
Missing (%): 0.0%
Missing (n): 0

Value   Count  Frequency (%)
male    577    64.8%
female  314    35.2%
 
Max length: 6
Mean length: 4.704826038
Min length: 4
Contains chars: True
Contains digits: False
Contains spaces: False
Contains non-words: False

SibSp
Numeric

Distinct count: 7
Unique (%): 0.8%
Missing (%): 0.0%
Missing (n): 0
Infinite (%): 0.0%
Infinite (n): 0
Mean: 0.5230078563
Minimum: 0
Maximum: 8
Zeros (%): 68.2%

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 1
95-th percentile: 3
Maximum: 8
Range: 8
Interquartile range: 1

Descriptive statistics

Standard deviation: 1.102743432
Coef of variation: 2.108464374
Kurtosis: 17.88041973
Mean: 0.5230078563
MAD: 0.7137795211
Skewness: 3.695351727
Sum: 466
Variance: 1.216043077
Memory size: 7.1 KiB
[Figure: histogram with fixed-size bins (bins=7)]
[Figure: histogram with variable-size bins (bins=[0, 0.5, 1.5, 4.5, 8], "bayesian blocks" binning strategy)]

Value  Count  Frequency (%)
0      608    68.2%
1      209    23.5%
2      28     3.1%
4      18     2.0%
3      16     1.8%
8      7      0.8%
5      5      0.6%
 

Minimum 5 values

Value  Count  Frequency (%)
0      608    68.2%
1      209    23.5%
2      28     3.1%
3      16     1.8%
4      18     2.0%

Maximum 5 values

Value  Count  Frequency (%)
8      7      0.8%
5      5      0.6%
4      18     2.0%
3      16     1.8%
2      28     3.1%
 

Survived
Boolean

Distinct count: 2
Unique (%): 0.2%
Missing (%): 0.0%
Missing (n): 0

Value  Count  Frequency (%)
0      549    61.6%
1      342    38.4%
 

Ticket
Categorical

Distinct count: 681
Unique (%): 76.4%
Missing (%): 0.0%
Missing (n): 0

Value               Count  Frequency (%)
CA. 2343            7      0.8%
1601                7      0.8%
347082              7      0.8%
347088              6      0.7%
CA 2144             6      0.7%
3101295             6      0.7%
382652              5      0.6%
S.O.C. 14879        5      0.6%
LINE                4      0.4%
17421               4      0.4%
Other values (671)  834    93.6%
 
Max length: 18
Mean length: 6.750841751
Min length: 3
Contains chars: True
Contains digits: True
Contains spaces: True
Contains non-words: True

Correlations