Overview

Dataset statistics

Number of variables: 12
Number of observations: 891
Missing cells: 866
Missing cells (%): 8.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 315.0 KiB
Average record size in memory: 362.1 B
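
As a quick consistency check, the missing-cell figures can be reproduced from the per-column missing counts that this report gives for Age (177), Cabin (687) and Embarked (2). This is a minimal sketch in plain Python; the dictionary and variable names are illustrative and not part of the report.

```python
# Per-column missing counts as reported in the Variables section of this report.
missing_per_column = {"Age": 177, "Cabin": 687, "Embarked": 2}

n_variables = 12
n_observations = 891

# Missing cells: total number of missing values across all columns.
missing_cells = sum(missing_per_column.values())

# Missing cells (%): share of all cells (rows x columns) that are missing.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(missing_cells, round(missing_pct, 1))
```

The three columns account for all 866 missing cells, which is about 8.1% of the 10,692 cells in the table.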

Variable types

CAT: 6
NUM: 5
BOOL: 1

Warnings

Ticket has a high cardinality: 681 distinct values (High cardinality)
Cabin has a high cardinality: 147 distinct values (High cardinality)
Age has 177 (19.9%) missing values (Missing)
Cabin has 687 (77.1%) missing values (Missing)
Ticket is uniformly distributed (Uniform)
Cabin is uniformly distributed (Uniform)
PassengerId has unique values (Unique)
Name has unique values (Unique)
SibSp has 608 (68.2%) zeros (Zeros)
Parch has 678 (76.1%) zeros (Zeros)
Fare has 15 (1.7%) zeros (Zeros)

Reproduction

Analysis started: 2020-10-25 20:10:13.384628
Analysis finished: 2020-10-25 20:10:21.142362
Duration: 7.76 seconds
Software version: pandas-profiling v2.9.0
Download configuration: config.yaml

Variables

PassengerId
Real number (ℝ≥0)

UNIQUE

Distinct: 891
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 446
Minimum: 1
Maximum: 891
Zeros: 0
Zeros (%): 0.0%
Memory size: 7.1 KiB

Quantile statistics

Minimum: 1
5th percentile: 45.5
Q1: 223.5
Median: 446
Q3: 668.5
95th percentile: 846.5
Maximum: 891
Range: 890
Interquartile range (IQR): 445

Descriptive statistics

Standard deviation: 257.353842
Coefficient of variation (CV): 0.5770265516
Kurtosis: -1.2
Mean: 446
Median Absolute Deviation (MAD): 223
Skewness: 0
Sum: 397386
Variance: 66231
Monotonicity: Strictly increasing
[Figure: Histogram with fixed size bins (bins=50)]
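
Because PassengerId takes each integer from 1 to 891 exactly once, the statistics above can be checked directly. The sketch below uses only the Python standard library. `linear_quantile` is a hypothetical helper that assumes quantiles are linearly interpolated between neighbouring order statistics; that is an assumption about how the report computed them, although it reproduces the reported quartiles.

```python
import math
import statistics

values = list(range(1, 892))  # PassengerId: 1, 2, ..., 891


def linear_quantile(sorted_values, q):
    """Quantile by linear interpolation between neighbouring order statistics."""
    position = q * (len(sorted_values) - 1)
    lower = math.floor(position)
    upper = math.ceil(position)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])


total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance (n - 1 denominator)
std = math.sqrt(variance)
cv = std / mean                         # coefficient of variation
q1 = linear_quantile(values, 0.25)
q3 = linear_quantile(values, 0.75)
iqr = q3 - q1

print(total, mean, variance)
print(round(std, 6), round(cv, 10))
print(q1, q3, iqr)
```

Under these assumptions the sketch agrees with the reported Sum (397386), Mean (446), Variance (66231), Q1 (223.5), Q3 (668.5) and IQR (445).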
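Common values

Value | Count | Frequency (%)
891 | 1 | 0.1%
293 | 1 | 0.1%
304 | 1 | 0.1%
303 | 1 | 0.1%
302 | 1 | 0.1%
301 | 1 | 0.1%
300 | 1 | 0.1%
299 | 1 | 0.1%
298 | 1 | 0.1%
297 | 1 | 0.1%
296 | 1 | 0.1%
295 | 1 | 0.1%
294 | 1 | 0.1%
292 | 1 | 0.1%
306 | 1 | 0.1%
291 | 1 | 0.1%
290 | 1 | 0.1%
289 | 1 | 0.1%
288 | 1 | 0.1%
287 | 1 | 0.1%
286 | 1 | 0.1%
285 | 1 | 0.1%
284 | 1 | 0.1%
283 | 1 | 0.1%
282 | 1 | 0.1%
Other values (866) | 866 | 97.2%

Smallest values

Value | Count | Frequency (%)
1 | 1 | 0.1%
2 | 1 | 0.1%
3 | 1 | 0.1%
4 | 1 | 0.1%
5 | 1 | 0.1%
6 | 1 | 0.1%
7 | 1 | 0.1%
8 | 1 | 0.1%
9 | 1 | 0.1%
10 | 1 | 0.1%

Largest values

Value | Count | Frequency (%)
891 | 1 | 0.1%
890 | 1 | 0.1%
889 | 1 | 0.1%
888 | 1 | 0.1%
887 | 1 | 0.1%
886 | 1 | 0.1%
885 | 1 | 0.1%
884 | 1 | 0.1%
883 | 1 | 0.1%
882 | 1 | 0.1%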

Survived
Boolean

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 7.1 KiB

Value | Count | Frequency (%)
0 | 549 | 61.6%
1 | 342 | 38.4%

Pclass
Categorical

Distinct: 3
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 7.1 KiB

Value | Count | Frequency (%)
3 | 491 | 55.1%
1 | 216 | 24.2%
2 | 184 | 20.7%

[Figure: Frequencies of value counts]

Unique

Unique: 0
Unique (%): 0.0%
[Figure: Histogram of lengths of the category]

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Overview of Unicode Properties

Unique Unicode characters: 3
Unique Unicode categories: 1
Unique Unicode scripts: 1
Unique Unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

Value | Count | Frequency (%)
3 | 491 | 55.1%
1 | 216 | 24.2%
2 | 184 | 20.7%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 891 | 100.0%

Most frequent Decimal Number characters

Value | Count | Frequency (%)
3 | 491 | 55.1%
1 | 216 | 24.2%
2 | 184 | 20.7%

Most occurring scripts

Value | Count | Frequency (%)
Common | 891 | 100.0%

Most frequent Common characters

Value | Count | Frequency (%)
3 | 491 | 55.1%
1 | 216 | 24.2%
2 | 184 | 20.7%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 891 | 100.0%

Most frequent ASCII characters

Value | Count | Frequency (%)
3 | 491 | 55.1%
1 | 216 | 24.2%
2 | 184 | 20.7%

Name
Categorical

UNIQUE

Distinct: 891
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 7.1 KiB

Value | Count | Frequency (%)
Duran y More, Miss. Asuncion | 1 | 0.1%
Padro y Manent, Mr. Julian | 1 | 0.1%
Culumovic, Mr. Jeso | 1 | 0.1%
Dick, Mrs. Albert Adrian (Vera Gillespie) | 1 | 0.1%
McGovern, Miss. Mary | 1 | 0.1%
Sagesser, Mlle. Emma | 1 | 0.1%
Hippach, Mrs. Louis Albert (Ida Sophia Fischer) | 1 | 0.1%
Ponesell, Mr. Martin | 1 | 0.1%
Ohman, Miss. Velin | 1 | 0.1%
Pasic, Mr. Jakob | 1 | 0.1%
Hoyt, Mr. William Fisher | 1 | 0.1%
Vander Planke, Mrs. Julius (Emelia Maria Vandemoortele) | 1 | 0.1%
Collyer, Mr. Harvey | 1 | 0.1%
Andersen-Jensen, Miss. Carla Christine Nielsine | 1 | 0.1%
Rood, Mr. Hugh Roscoe | 1 | 0.1%
McEvoy, Mr. Michael | 1 | 0.1%
Turja, Miss. Anna Sofia | 1 | 0.1%
Natsch, Mr. Charles H | 1 | 0.1%
Harper, Mrs. Henry Sleeper (Myna Haxtun) | 1 | 0.1%
Hunt, Mr. George Henry | 1 | 0.1%
Shelley, Mrs. William (Imanita Parrish Hall) | 1 | 0.1%
Leonard, Mr. Lionel | 1 | 0.1%
Yasbeck, Mrs. Antoni (Selini Alexander) | 1 | 0.1%
Vande Walle, Mr. Nestor Cyriel | 1 | 0.1%
Murdlin, Mr. Joseph | 1 | 0.1%
Other values (866) | 866 | 97.2%

[Figure: Frequencies of value counts]

Unique

Unique: 891
Unique (%): 100.0%
[Figure: Histogram of lengths of the category]

Length

Max length: 82
Median length: 25
Mean length: 26.96520763
Min length: 12

Overview of Unicode Properties

Unique Unicode characters: 60
Unique Unicode categories: 7
Unique Unicode scripts: 2
Unique Unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

Value | Count | Frequency (%)
(space) | 2735 | 11.4%
r | 1958 | 8.1%
e | 1703 | 7.1%
a | 1657 | 6.9%
i | 1325 | 5.5%
n | 1304 | 5.4%
s | 1297 | 5.4%
M | 1128 | 4.7%
l | 1067 | 4.4%
o | 1008 | 4.2%
. | 892 | 3.7%
, | 891 | 3.7%
t | 667 | 2.8%
h | 517 | 2.2%
d | 485 | 2.0%
m | 386 | 1.6%
u | 341 | 1.4%
c | 284 | 1.2%
y | 251 | 1.0%
A | 250 | 1.0%
g | 235 | 1.0%
k | 219 | 0.9%
J | 215 | 0.9%
H | 203 | 0.8%
S | 180 | 0.7%
Other values (35) | 2828 | 11.8%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 15446 | 64.3%
Uppercase Letter | 3645 | 15.2%
Space Separator | 2735 | 11.4%
Other Punctuation | 1899 | 7.9%
Open Punctuation | 144 | 0.6%
Close Punctuation | 144 | 0.6%
Dash Punctuation | 13 | 0.1%

Most frequent Uppercase Letter characters

Value | Count | Frequency (%)
M | 1128 | 30.9%
A | 250 | 6.9%
J | 215 | 5.9%
H | 203 | 5.6%
S | 180 | 4.9%
C | 172 | 4.7%
E | 166 | 4.6%
W | 143 | 3.9%
B | 140 | 3.8%
L | 129 | 3.5%
G | 114 | 3.1%
R | 112 | 3.1%
P | 110 | 3.0%
F | 103 | 2.8%
D | 99 | 2.7%
T | 86 | 2.4%
K | 72 | 2.0%
N | 70 | 1.9%
O | 45 | 1.2%
V | 44 | 1.2%
I | 34 | 0.9%
Y | 13 | 0.4%
Z | 7 | 0.2%
U | 5 | 0.1%
Q | 5 | 0.1%

Most frequent Lowercase Letter characters

Value | Count | Frequency (%)
r | 1958 | 12.7%
e | 1703 | 11.0%
a | 1657 | 10.7%
i | 1325 | 8.6%
n | 1304 | 8.4%
s | 1297 | 8.4%
l | 1067 | 6.9%
o | 1008 | 6.5%
t | 667 | 4.3%
h | 517 | 3.3%
d | 485 | 3.1%
m | 386 | 2.5%
u | 341 | 2.2%
c | 284 | 1.8%
y | 251 | 1.6%
g | 235 | 1.5%
k | 219 | 1.4%
b | 167 | 1.1%
f | 159 | 1.0%
v | 124 | 0.8%
w | 99 | 0.6%
p | 89 | 0.6%
z | 44 | 0.3%
j | 30 | 0.2%
x | 25 | 0.2%

Most frequent Other Punctuation characters

Value | Count | Frequency (%)
. | 892 | 47.0%
, | 891 | 46.9%
" | 106 | 5.6%
' | 9 | 0.5%
/ | 1 | 0.1%

Most frequent Space Separator characters

Value | Count | Frequency (%)
(space) | 2735 | 100.0%

Most frequent Open Punctuation characters

Value | Count | Frequency (%)
( | 144 | 100.0%

Most frequent Close Punctuation characters

Value | Count | Frequency (%)
) | 144 | 100.0%

Most frequent Dash Punctuation characters

Value | Count | Frequency (%)
- | 13 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 19091 | 79.5%
Common | 4935 | 20.5%

Most frequent Latin characters

Value | Count | Frequency (%)
r | 1958 | 10.3%
e | 1703 | 8.9%
a | 1657 | 8.7%
i | 1325 | 6.9%
n | 1304 | 6.8%
s | 1297 | 6.8%
M | 1128 | 5.9%
l | 1067 | 5.6%
o | 1008 | 5.3%
t | 667 | 3.5%
h | 517 | 2.7%
d | 485 | 2.5%
m | 386 | 2.0%
u | 341 | 1.8%
c | 284 | 1.5%
y | 251 | 1.3%
A | 250 | 1.3%
g | 235 | 1.2%
k | 219 | 1.1%
J | 215 | 1.1%
H | 203 | 1.1%
S | 180 | 0.9%
C | 172 | 0.9%
b | 167 | 0.9%
E | 166 | 0.9%
Other values (26) | 1906 | 10.0%

Most frequent Common characters

Value | Count | Frequency (%)
(space) | 2735 | 55.4%
. | 892 | 18.1%
, | 891 | 18.1%
( | 144 | 2.9%
) | 144 | 2.9%
" | 106 | 2.1%
- | 13 | 0.3%
' | 9 | 0.2%
/ | 1 | < 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 24026 | 100.0%

Most frequent ASCII characters

Value | Count | Frequency (%)
(space) | 2735 | 11.4%
r | 1958 | 8.1%
e | 1703 | 7.1%
a | 1657 | 6.9%
i | 1325 | 5.5%
n | 1304 | 5.4%
s | 1297 | 5.4%
M | 1128 | 4.7%
l | 1067 | 4.4%
o | 1008 | 4.2%
. | 892 | 3.7%
, | 891 | 3.7%
t | 667 | 2.8%
h | 517 | 2.2%
d | 485 | 2.0%
m | 386 | 1.6%
u | 341 | 1.4%
c | 284 | 1.2%
y | 251 | 1.0%
A | 250 | 1.0%
g | 235 | 1.0%
k | 219 | 0.9%
J | 215 | 0.9%
H | 203 | 0.8%
S | 180 | 0.7%
Other values (35) | 2828 | 11.8%

Sex
Categorical

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 7.1 KiB

Value | Count | Frequency (%)
male | 577 | 64.8%
female | 314 | 35.2%

[Figure: Frequencies of value counts]

Unique

Unique: 0
Unique (%): 0.0%
[Figure: Histogram of lengths of the category]

Length

Max length: 6
Median length: 4
Mean length: 4.704826038
Min length: 4

Overview of Unicode Properties

Unique Unicode characters: 5
Unique Unicode categories: 1
Unique Unicode scripts: 1
Unique Unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

Value | Count | Frequency (%)
e | 1205 | 28.7%
m | 891 | 21.3%
a | 891 | 21.3%
l | 891 | 21.3%
f | 314 | 7.5%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 4192 | 100.0%

Most frequent Lowercase Letter characters

Value | Count | Frequency (%)
e | 1205 | 28.7%
m | 891 | 21.3%
a | 891 | 21.3%
l | 891 | 21.3%
f | 314 | 7.5%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 4192 | 100.0%

Most frequent Latin characters

Value | Count | Frequency (%)
e | 1205 | 28.7%
m | 891 | 21.3%
a | 891 | 21.3%
l | 891 | 21.3%
f | 314 | 7.5%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 4192 | 100.0%

Most frequent ASCII characters

Value | Count | Frequency (%)
e | 1205 | 28.7%
m | 891 | 21.3%
a | 891 | 21.3%
l | 891 | 21.3%
f | 314 | 7.5%

Age
Real number (ℝ≥0)

MISSING

Distinct: 88
Distinct (%): 12.3%
Missing: 177
Missing (%): 19.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 29.69911765
Minimum: 0.42
Maximum: 80
Zeros: 0
Zeros (%): 0.0%
Memory size: 7.1 KiB

Quantile statistics

Minimum: 0.42
5th percentile: 4
Q1: 20.125
Median: 28
Q3: 38
95th percentile: 56
Maximum: 80
Range: 79.58
Interquartile range (IQR): 17.875

Descriptive statistics

Standard deviation: 14.52649733
Coefficient of variation (CV): 0.4891221855
Kurtosis: 0.1782741536
Mean: 29.69911765
Median Absolute Deviation (MAD): 9
Skewness: 0.3891077823
Sum: 21205.17
Variance: 211.0191247
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]
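
Age has 177 missing values, so some of the figures above appear to be computed over the 714 non-missing observations rather than all 891 rows. The sketch below, which uses only numbers reported in this section, checks that reading. It is an inference from the arithmetic, not a statement about pandas-profiling internals, and the variable names are illustrative.

```python
n_observations = 891
n_missing = 177
n_present = n_observations - n_missing  # non-missing ages

distinct = 88
reported_sum = 21205.17

# Missing (%) is relative to all rows.
missing_pct = 100 * n_missing / n_observations

# Distinct (%) and Mean match the report only when the denominator
# is the number of non-missing values.
distinct_pct = 100 * distinct / n_present
mean_age = reported_sum / n_present

print(n_present, round(missing_pct, 1), round(distinct_pct, 1), round(mean_age, 6))
```

With 714 as the denominator, the sketch gives 12.3% distinct values and a mean of about 29.6991, in line with the reported figures. Dividing by 891 would give roughly 9.9% and 23.8 instead.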
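Common values

Value | Count | Frequency (%)
24 | 30 | 3.4%
22 | 27 | 3.0%
18 | 26 | 2.9%
28 | 25 | 2.8%
19 | 25 | 2.8%
30 | 25 | 2.8%
21 | 24 | 2.7%
25 | 23 | 2.6%
36 | 22 | 2.5%
29 | 20 | 2.2%
32 | 18 | 2.0%
26 | 18 | 2.0%
35 | 18 | 2.0%
27 | 18 | 2.0%
16 | 17 | 1.9%
31 | 17 | 1.9%
34 | 15 | 1.7%
23 | 15 | 1.7%
33 | 15 | 1.7%
20 | 15 | 1.7%
39 | 14 | 1.6%
17 | 13 | 1.5%
42 | 13 | 1.5%
40 | 13 | 1.5%
45 | 12 | 1.3%
Other values (63) | 236 | 26.5%
(Missing) | 177 | 19.9%

Smallest values

Value | Count | Frequency (%)
0.42 | 1 | 0.1%
0.67 | 1 | 0.1%
0.75 | 2 | 0.2%
0.83 | 2 | 0.2%
0.92 | 1 | 0.1%
1 | 7 | 0.8%
2 | 10 | 1.1%
3 | 6 | 0.7%
4 | 10 | 1.1%
5 | 4 | 0.4%

Largest values

Value | Count | Frequency (%)
80 | 1 | 0.1%
74 | 1 | 0.1%
71 | 2 | 0.2%
70.5 | 1 | 0.1%
70 | 2 | 0.2%
66 | 1 | 0.1%
65 | 3 | 0.3%
64 | 2 | 0.2%
63 | 2 | 0.2%
62 | 4 | 0.4%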

SibSp
Real number (ℝ≥0)

ZEROS

Distinct: 7
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.5230078563
Minimum: 0
Maximum: 8
Zeros: 608
Zeros (%): 68.2%
Memory size: 7.1 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 0
Q3: 1
95th percentile: 3
Maximum: 8
Range: 8
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.102743432
Coefficient of variation (CV): 2.108464374
Kurtosis: 17.88041973
Mean: 0.5230078563
Median Absolute Deviation (MAD): 0
Skewness: 3.695351727
Sum: 466
Variance: 1.216043077
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=7)]
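Common values

Value | Count | Frequency (%)
0 | 608 | 68.2%
1 | 209 | 23.5%
2 | 28 | 3.1%
4 | 18 | 2.0%
3 | 16 | 1.8%
8 | 7 | 0.8%
5 | 5 | 0.6%

Smallest values

Value | Count | Frequency (%)
0 | 608 | 68.2%
1 | 209 | 23.5%
2 | 28 | 3.1%
3 | 16 | 1.8%
4 | 18 | 2.0%
5 | 5 | 0.6%
8 | 7 | 0.8%

Largest values

Value | Count | Frequency (%)
8 | 7 | 0.8%
5 | 5 | 0.6%
4 | 18 | 2.0%
3 | 16 | 1.8%
2 | 28 | 3.1%
1 | 209 | 23.5%
0 | 608 | 68.2%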

Parch
Real number (ℝ≥0)

ZEROS

Distinct: 7
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.3815937149
Minimum: 0
Maximum: 6
Zeros: 678
Zeros (%): 76.1%
Memory size: 7.1 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 0
Q3: 0
95th percentile: 2
Maximum: 6
Range: 6
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.8060572211
Coefficient of variation (CV): 2.112344071
Kurtosis: 9.778125179
Mean: 0.3815937149
Median Absolute Deviation (MAD): 0
Skewness: 2.749117047
Sum: 340
Variance: 0.6497282437
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=7)]
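Common values

Value | Count | Frequency (%)
0 | 678 | 76.1%
1 | 118 | 13.2%
2 | 80 | 9.0%
5 | 5 | 0.6%
3 | 5 | 0.6%
4 | 4 | 0.4%
6 | 1 | 0.1%

Smallest values

Value | Count | Frequency (%)
0 | 678 | 76.1%
1 | 118 | 13.2%
2 | 80 | 9.0%
3 | 5 | 0.6%
4 | 4 | 0.4%
5 | 5 | 0.6%
6 | 1 | 0.1%

Largest values

Value | Count | Frequency (%)
6 | 1 | 0.1%
5 | 5 | 0.6%
4 | 4 | 0.4%
3 | 5 | 0.6%
2 | 80 | 9.0%
1 | 118 | 13.2%
0 | 678 | 76.1%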

Ticket
Categorical

HIGH CARDINALITY
UNIFORM

Distinct: 681
Distinct (%): 76.4%
Missing: 0
Missing (%): 0.0%
Memory size: 7.1 KiB

Value | Count | Frequency (%)
1601 | 7 | 0.8%
CA. 2343 | 7 | 0.8%
347082 | 7 | 0.8%
3101295 | 6 | 0.7%
CA 2144 | 6 | 0.7%
347088 | 6 | 0.7%
S.O.C. 14879 | 5 | 0.6%
382652 | 5 | 0.6%
19950 | 4 | 0.4%
17421 | 4 | 0.4%
PC 17757 | 4 | 0.4%
W./C. 6608 | 4 | 0.4%
113781 | 4 | 0.4%
347077 | 4 | 0.4%
113760 | 4 | 0.4%
349909 | 4 | 0.4%
2666 | 4 | 0.4%
LINE | 4 | 0.4%
4133 | 4 | 0.4%
F.C.C. 13529 | 3 | 0.3%
C.A. 31921 | 3 | 0.3%
13502 | 3 | 0.3%
347742 | 3 | 0.3%
PC 17572 | 3 | 0.3%
345773 | 3 | 0.3%
Other values (656) | 780 | 87.5%

[Figure: Frequencies of value counts]

Unique

Unique: 547
Unique (%): 61.4%
[Figure: Histogram of lengths of the category]

Length

Max length: 18
Median length: 6
Mean length: 6.750841751
Min length: 3

Overview of Unicode Properties

Unique Unicode characters: 35
Unique Unicode categories: 5
Unique Unicode scripts: 2
Unique Unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

Value | Count | Frequency (%)
3 | 746 | 12.4%
1 | 689 | 11.5%
2 | 594 | 9.9%
7 | 490 | 8.1%
4 | 464 | 7.7%
6 | 422 | 7.0%
0 | 406 | 6.7%
5 | 387 | 6.4%
9 | 328 | 5.5%
8 | 282 | 4.7%
(space) | 239 | 4.0%
. | 197 | 3.3%
C | 151 | 2.5%
O | 100 | 1.7%
/ | 98 | 1.6%
P | 98 | 1.6%
A | 82 | 1.4%
S | 74 | 1.2%
N | 40 | 0.7%
T | 36 | 0.6%
W | 16 | 0.3%
Q | 15 | 0.2%
I | 11 | 0.2%
E | 7 | 0.1%
R | 7 | 0.1%
Other values (10) | 36 | 0.6%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 4808 | 79.9%
Uppercase Letter | 652 | 10.8%
Other Punctuation | 295 | 4.9%
Space Separator | 239 | 4.0%
Lowercase Letter | 21 | 0.3%

Most frequent Uppercase Letter characters

Value | Count | Frequency (%)
C | 151 | 23.2%
O | 100 | 15.3%
P | 98 | 15.0%
A | 82 | 12.6%
S | 74 | 11.3%
N | 40 | 6.1%
T | 36 | 5.5%
W | 16 | 2.5%
Q | 15 | 2.3%
I | 11 | 1.7%
E | 7 | 1.1%
R | 7 | 1.1%
F | 7 | 1.1%
L | 4 | 0.6%
H | 3 | 0.5%
B | 1 | 0.2%

Most frequent Other Punctuation characters

Value | Count | Frequency (%)
. | 197 | 66.8%
/ | 98 | 33.2%

Most frequent Decimal Number characters

Value | Count | Frequency (%)
3 | 746 | 15.5%
1 | 689 | 14.3%
2 | 594 | 12.4%
7 | 490 | 10.2%
4 | 464 | 9.7%
6 | 422 | 8.8%
0 | 406 | 8.4%
5 | 387 | 8.0%
9 | 328 | 6.8%
8 | 282 | 5.9%

Most frequent Space Separator characters

Value | Count | Frequency (%)
(space) | 239 | 100.0%

Most frequent Lowercase Letter characters

Value | Count | Frequency (%)
a | 6 | 28.6%
s | 5 | 23.8%
r | 4 | 19.0%
i | 4 | 19.0%
l | 1 | 4.8%
e | 1 | 4.8%

Most occurring scripts

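Value | Count | Frequency (%)
Common | 5342 | 88.8%
Latin | 673 | 11.2%

Most frequent Latin characters

Value | Count | Frequency (%)
C | 151 | 22.4%
O | 100 | 14.9%
P | 98 | 14.6%
A | 82 | 12.2%
S | 74 | 11.0%
N | 40 | 5.9%
T | 36 | 5.3%
W | 16 | 2.4%
Q | 15 | 2.2%
I | 11 | 1.6%
E | 7 | 1.0%
R | 7 | 1.0%
F | 7 | 1.0%
a | 6 | 0.9%
s | 5 | 0.7%
r | 4 | 0.6%
i | 4 | 0.6%
L | 4 | 0.6%
H | 3 | 0.4%
B | 1 | 0.1%
l | 1 | 0.1%
e | 1 | 0.1%

Most frequent Common characters

Value | Count | Frequency (%)
3 | 746 | 14.0%
1 | 689 | 12.9%
2 | 594 | 11.1%
7 | 490 | 9.2%
4 | 464 | 8.7%
6 | 422 | 7.9%
0 | 406 | 7.6%
5 | 387 | 7.2%
9 | 328 | 6.1%
8 | 282 | 5.3%
(space) | 239 | 4.5%
. | 197 | 3.7%
/ | 98 | 1.8%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 6015 | 100.0%

Most frequent ASCII characters

Value | Count | Frequency (%)
3 | 746 | 12.4%
1 | 689 | 11.5%
2 | 594 | 9.9%
7 | 490 | 8.1%
4 | 464 | 7.7%
6 | 422 | 7.0%
0 | 406 | 6.7%
5 | 387 | 6.4%
9 | 328 | 5.5%
8 | 282 | 4.7%
(space) | 239 | 4.0%
. | 197 | 3.3%
C | 151 | 2.5%
O | 100 | 1.7%
/ | 98 | 1.6%
P | 98 | 1.6%
A | 82 | 1.4%
S | 74 | 1.2%
N | 40 | 0.7%
T | 36 | 0.6%
W | 16 | 0.3%
Q | 15 | 0.2%
I | 11 | 0.2%
E | 7 | 0.1%
R | 7 | 0.1%
Other values (10) | 36 | 0.6%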

Fare
Real number (ℝ≥0)

ZEROS

Distinct: 248
Distinct (%): 27.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 32.20420797
Minimum: 0
Maximum: 512.3292
Zeros: 15
Zeros (%): 1.7%
Memory size: 7.1 KiB

Quantile statistics

Minimum: 0
5th percentile: 7.225
Q1: 7.9104
Median: 14.4542
Q3: 31
95th percentile: 112.07915
Maximum: 512.3292
Range: 512.3292
Interquartile range (IQR): 23.0896

Descriptive statistics

Standard deviation: 49.6934286
Coefficient of variation (CV): 1.543072528
Kurtosis: 33.39814088
Mean: 32.20420797
Median Absolute Deviation (MAD): 6.9042
Skewness: 4.78731652
Sum: 28693.9493
Variance: 2469.436846
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]
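Common values

Value | Count | Frequency (%)
8.05 | 43 | 4.8%
13 | 42 | 4.7%
7.8958 | 38 | 4.3%
7.75 | 34 | 3.8%
26 | 31 | 3.5%
10.5 | 24 | 2.7%
7.925 | 18 | 2.0%
7.775 | 16 | 1.8%
26.55 | 15 | 1.7%
0 | 15 | 1.7%
7.2292 | 15 | 1.7%
7.8542 | 13 | 1.5%
8.6625 | 13 | 1.5%
7.25 | 13 | 1.5%
7.225 | 12 | 1.3%
16.1 | 9 | 1.0%
9.5 | 9 | 1.0%
24.15 | 8 | 0.9%
15.5 | 8 | 0.9%
56.4958 | 7 | 0.8%
52 | 7 | 0.8%
14.5 | 7 | 0.8%
14.4542 | 7 | 0.8%
69.55 | 7 | 0.8%
7.05 | 7 | 0.8%
Other values (223) | 473 | 53.1%

Smallest values

Value | Count | Frequency (%)
0 | 15 | 1.7%
4.0125 | 1 | 0.1%
5 | 1 | 0.1%
6.2375 | 1 | 0.1%
6.4375 | 1 | 0.1%
6.45 | 1 | 0.1%
6.4958 | 2 | 0.2%
6.75 | 2 | 0.2%
6.8583 | 1 | 0.1%
6.95 | 1 | 0.1%

Largest values

Value | Count | Frequency (%)
512.3292 | 3 | 0.3%
263 | 4 | 0.4%
262.375 | 2 | 0.2%
247.5208 | 2 | 0.2%
227.525 | 4 | 0.4%
221.7792 | 1 | 0.1%
211.5 | 1 | 0.1%
211.3375 | 3 | 0.3%
164.8667 | 2 | 0.2%
153.4625 | 3 | 0.3%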

Cabin
Categorical

HIGH CARDINALITY
MISSING
UNIFORM

Distinct: 147
Distinct (%): 72.1%
Missing: 687
Missing (%): 77.1%
Memory size: 7.1 KiB

Value | Count | Frequency (%)
B96 B98 | 4 | 0.4%
C23 C25 C27 | 4 | 0.4%
G6 | 4 | 0.4%
F33 | 3 | 0.3%
D | 3 | 0.3%
E101 | 3 | 0.3%
F2 | 3 | 0.3%
C22 C26 | 3 | 0.3%
D35 | 2 | 0.2%
C52 | 2 | 0.2%
D26 | 2 | 0.2%
D20 | 2 | 0.2%
E8 | 2 | 0.2%
C92 | 2 | 0.2%
B28 | 2 | 0.2%
F G73 | 2 | 0.2%
D17 | 2 | 0.2%
C123 | 2 | 0.2%
E33 | 2 | 0.2%
D36 | 2 | 0.2%
B49 | 2 | 0.2%
E24 | 2 | 0.2%
F4 | 2 | 0.2%
B58 B60 | 2 | 0.2%
C2 | 2 | 0.2%
Other values (122) | 143 | 16.0%
(Missing) | 687 | 77.1%

[Figure: Frequencies of value counts]

Unique

Unique: 101
Unique (%): 49.5%
[Figure: Histogram of lengths of the category]

Length

Max length: 15
Median length: 3
Mean length: 3.134680135
Min length: 1

Overview of Unicode Properties

Unique Unicode characters: 21
Unique Unicode categories: 4
Unique Unicode scripts: 2
Unique Unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

Value | Count | Frequency (%)
n | 1374 | 49.2%
a | 687 | 24.6%
2 | 72 | 2.6%
C | 71 | 2.5%
B | 64 | 2.3%
1 | 61 | 2.2%
3 | 59 | 2.1%
6 | 51 | 1.8%
5 | 45 | 1.6%
8 | 37 | 1.3%
4 | 37 | 1.3%
D | 34 | 1.2%
(space) | 34 | 1.2%
7 | 34 | 1.2%
E | 33 | 1.2%
9 | 33 | 1.2%
0 | 31 | 1.1%
A | 15 | 0.5%
F | 13 | 0.5%
G | 7 | 0.3%
T | 1 | < 0.1%
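
The counts for "n" (1374) and "a" (687) in the table above are exactly what would result if each of the 687 missing Cabin values had been rendered as the three-character string `nan` before characters were counted. The sketch below illustrates that reading with the Python standard library. It is an interpretation of the numbers, not a confirmed description of the tool's behaviour, and the variable names are illustrative.

```python
from collections import Counter

n_missing = 687
n_observations = 891

# A missing float value prints as the three-character string "nan".
placeholder = str(float("nan"))

# Count the characters contributed by the missing rows alone.
char_counts = Counter(placeholder * n_missing)
print(placeholder, char_counts["n"], char_counts["a"])

# Mean length over all rows when those placeholders are included,
# using the ASCII character total (2793) reported for Cabin.
total_characters = 2793
mean_length = total_characters / n_observations
print(round(mean_length, 9))
```

If this reading is right, the lowercase-letter counts and the length statistics for Cabin say more about the missing-value placeholder than about real cabin codes, and the same would apply to the "n" and "a" counts reported for Embarked.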
 

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 2061 | 73.8%
Decimal Number | 460 | 16.5%
Uppercase Letter | 238 | 8.5%
Space Separator | 34 | 1.2%

Most frequent Lowercase Letter characters

Value | Count | Frequency (%)
n | 1374 | 66.7%
a | 687 | 33.3%

Most frequent Uppercase Letter characters

Value | Count | Frequency (%)
C | 71 | 29.8%
B | 64 | 26.9%
D | 34 | 14.3%
E | 33 | 13.9%
A | 15 | 6.3%
F | 13 | 5.5%
G | 7 | 2.9%
T | 1 | 0.4%

Most frequent Decimal Number characters

Value | Count | Frequency (%)
2 | 72 | 15.7%
1 | 61 | 13.3%
3 | 59 | 12.8%
6 | 51 | 11.1%
5 | 45 | 9.8%
8 | 37 | 8.0%
4 | 37 | 8.0%
7 | 34 | 7.4%
9 | 33 | 7.2%
0 | 31 | 6.7%

Most frequent Space Separator characters

Value | Count | Frequency (%)
(space) | 34 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 2299 | 82.3%
Common | 494 | 17.7%

Most frequent Latin characters

Value | Count | Frequency (%)
n | 1374 | 59.8%
a | 687 | 29.9%
C | 71 | 3.1%
B | 64 | 2.8%
D | 34 | 1.5%
E | 33 | 1.4%
A | 15 | 0.7%
F | 13 | 0.6%
G | 7 | 0.3%
T | 1 | < 0.1%

Most frequent Common characters

Value | Count | Frequency (%)
2 | 72 | 14.6%
1 | 61 | 12.3%
3 | 59 | 11.9%
6 | 51 | 10.3%
5 | 45 | 9.1%
8 | 37 | 7.5%
4 | 37 | 7.5%
(space) | 34 | 6.9%
7 | 34 | 6.9%
9 | 33 | 6.7%
0 | 31 | 6.3%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 2793 | 100.0%

Most frequent ASCII characters

Value | Count | Frequency (%)
n | 1374 | 49.2%
a | 687 | 24.6%
2 | 72 | 2.6%
C | 71 | 2.5%
B | 64 | 2.3%
1 | 61 | 2.2%
3 | 59 | 2.1%
6 | 51 | 1.8%
5 | 45 | 1.6%
8 | 37 | 1.3%
4 | 37 | 1.3%
D | 34 | 1.2%
(space) | 34 | 1.2%
7 | 34 | 1.2%
E | 33 | 1.2%
9 | 33 | 1.2%
0 | 31 | 1.1%
A | 15 | 0.5%
F | 13 | 0.5%
G | 7 | 0.3%
T | 1 | < 0.1%

Embarked
Categorical

Distinct: 3
Distinct (%): 0.3%
Missing: 2
Missing (%): 0.2%
Memory size: 7.1 KiB

Value | Count | Frequency (%)
S | 644 | 72.3%
C | 168 | 18.9%
Q | 77 | 8.6%
(Missing) | 2 | 0.2%

[Figure: Frequencies of value counts]

Unique

Unique: 0
Unique (%): 0.0%
[Figure: Histogram of lengths of the category]

Length

Max length: 3
Median length: 1
Mean length: 1.004489338
Min length: 1

Overview of Unicode Properties

Unique Unicode characters: 5
Unique Unicode categories: 2
Unique Unicode scripts: 1
Unique Unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

Value | Count | Frequency (%)
S | 644 | 72.0%
C | 168 | 18.8%
Q | 77 | 8.6%
n | 4 | 0.4%
a | 2 | 0.2%

Most occurring categories

Value | Count | Frequency (%)
Uppercase Letter | 889 | 99.3%
Lowercase Letter | 6 | 0.7%

Most frequent Uppercase Letter characters

Value | Count | Frequency (%)
S | 644 | 72.4%
C | 168 | 18.9%
Q | 77 | 8.7%

Most frequent Lowercase Letter characters

Value | Count | Frequency (%)
n | 4 | 66.7%
a | 2 | 33.3%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 895 | 100.0%

Most frequent Latin characters

Value | Count | Frequency (%)
S | 644 | 72.0%
C | 168 | 18.8%
Q | 77 | 8.6%
n | 4 | 0.4%
a | 2 | 0.2%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 895 | 100.0%

Most frequent ASCII characters

Value | Count | Frequency (%)
S | 644 | 72.0%
C | 168 | 18.8%
Q | 77 | 8.6%
n | 4 | 0.4%
a | 2 | 0.2%

Interactions

[Figures: interaction plots (3 images)]