Overview

Dataset statistics

Number of variables: 25
Number of observations: 87
Missing cells: 513
Missing cells (%): 23.6%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 115.3 KiB
Average record size in memory: 1.3 KiB
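
The missing-cell percentage follows from the counts in this block: 513 missing cells out of 87 x 25 = 2,175 cells. A minimal sketch of that arithmetic; the function name `missing_cell_share` is hypothetical and is not part of ydata-profiling.

```python
def missing_cell_share(missing_cells: int, n_rows: int, n_cols: int) -> float:
    """Fraction of all cells (rows x columns) that are missing."""
    total_cells = n_rows * n_cols
    if total_cells == 0:
        raise ValueError("dataset has no cells")
    return missing_cells / total_cells


# Counts taken from the dataset statistics in this report.
share = missing_cell_share(missing_cells=513, n_rows=87, n_cols=25)
print(f"{share:.1%}")  # 23.6%
```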

Variable types

Text: 9
Numeric: 4
Categorical: 11
URL: 1

Alerts

abilities is highly overall correlated with sex [High correlation]
birth_era is highly overall correlated with films [High correlation]
birth_year is highly overall correlated with sex and 1 other field [High correlation]
death_era is highly overall correlated with films [High correlation]
films is highly overall correlated with birth_era and 2 other fields [High correlation]
gender is highly overall correlated with pronoun and 1 other field [High correlation]
height is highly overall correlated with mass and 1 other field [High correlation]
mass is highly overall correlated with height and 3 other fields [High correlation]
pronoun is highly overall correlated with gender and 1 other field [High correlation]
sex is highly overall correlated with abilities and 7 other fields [High correlation]
skin_color is highly overall correlated with mass and 2 other fields [High correlation]
species is highly overall correlated with birth_year and 4 other fields [High correlation]
birth_era is highly imbalanced (53.1%) [Imbalance]
height has 1 (1.1%) missing value [Missing]
mass has 22 (25.3%) missing values [Missing]
hair_color has 5 (5.7%) missing values [Missing]
skin_color has 1 (1.1%) missing value [Missing]
birth_year has 37 (42.5%) missing values [Missing]
birth_era has 37 (42.5%) missing values [Missing]
birth_place has 50 (57.5%) missing values [Missing]
death_year has 25 (28.7%) missing values [Missing]
death_era has 25 (28.7%) missing values [Missing]
death_place has 30 (34.5%) missing values [Missing]
homeworld has 4 (4.6%) missing values [Missing]
cybernetics has 80 (92.0%) missing values [Missing]
abilities has 32 (36.8%) missing values [Missing]
equipment has 25 (28.7%) missing values [Missing]
vehicles has 72 (82.8%) missing values [Missing]
starships has 67 (77.0%) missing values [Missing]
name has unique values [Unique]
photo has unique values [Unique]
death_year has 10 (11.5%) zeros [Zeros]

Reproduction

Analysis started: 2023-12-30 08:20:12.093341
Analysis finished: 2023-12-30 08:20:13.802663
Duration: 1.71 seconds
Software version: ydata-profiling v0.0.dev0
Download configuration: config.json

Variables

name
Text

UNIQUE 

Distinct: 87
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 6.0 KiB

Length

Max length: 21
Median length: 15
Mean length: 10.804598
Min length: 4

Characters and Unicode

Total characters: 940
Distinct characters: 60
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
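
As a rough illustration of what a "category" means here, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. It runs on the five sample names shown in this report rather than the full column, and the helper name `category_counts` is hypothetical. The two-letter codes map onto the labels used in the report (`Lu` = Uppercase Letter, `Ll` = Lowercase Letter, `Zs` = Space Separator, `Nd` = Decimal Number, `Pd` = Dash Punctuation). The standard library does not expose script or block lookups, so those are not covered.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories (e.g. 'Lu', 'Ll', 'Nd') over all characters."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


# The five sample rows listed for `name`, not the full 87-row column.
sample_names = ["Luke Skywalker", "C-3PO", "R2-D2", "Darth Vader", "Leia Organa Solo"]
counts = category_counts(sample_names)
print(counts.most_common())
```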

Unique

Unique: 87
Unique (%): 100.0%

Sample

1st row: Luke Skywalker
2nd row: C-3PO
3rd row: R2-D2
4th row: Darth Vader
5th row: Leia Organa Solo

Value               Count  Frequency (%)
lars                    4   2.5%
skywalker               4   2.5%
fett                    2   1.2%
organa                  2   1.2%
solo                    2   1.2%
antilles                2   1.2%
darth                   2   1.2%
jar                     2   1.2%
kenobi                  1   0.6%
biggs                   1   0.6%
Other values (139)    139  86.3%
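
The word counts for `name` sum to 161 tokens, so the percentages appear to be relative to the total number of words rather than the 87 rows (4 / 161 is about 2.5%). The sketch below shows one way such a table could be produced, assuming lowercasing and whitespace splitting; the exact tokenisation used by ydata-profiling is an assumption here, and `word_frequencies` is a hypothetical helper.

```python
from collections import Counter


def word_frequencies(values):
    """Lowercase each value, split on whitespace, and return (count, share) per token."""
    words = Counter()
    for value in values:
        words.update(value.lower().split())
    total = sum(words.values())
    return {word: (count, count / total) for word, count in words.items()}


# The five sample rows listed for `name`, not the full 87-row column.
sample_names = ["Luke Skywalker", "C-3PO", "R2-D2", "Darth Vader", "Leia Organa Solo"]
freqs = word_frequencies(sample_names)
for word, (count, share) in freqs.items():
    print(f"{word:<12}{count:>3}  {share:.1%}")
```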

Most occurring characters

Value               Count  Frequency (%)
a                      97  10.3%
(space)                74   7.9%
e                      63   6.7%
r                      57   6.1%
i                      56   6.0%
o                      52   5.5%
n                      47   5.0%
s                      43   4.6%
l                      41   4.4%
t                      35   3.7%
Other values (50)     375  39.9%

Most occurring categories

Value               Count  Frequency (%)
Lowercase Letter      670  71.3%
Uppercase Letter      173  18.4%
Space Separator        74   7.9%
Decimal Number         11   1.2%
Dash Punctuation       10   1.1%
Other Punctuation       2   0.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 97
14.5%
e 63
9.4%
r 57
 
8.5%
i 56
 
8.4%
o 52
 
7.8%
n 47
 
7.0%
s 43
 
6.4%
l 41
 
6.1%
t 35
 
5.2%
u 29
 
4.3%
Other values (15) 150
22.4%
Uppercase Letter
ValueCountFrequency (%)
S 14
 
8.1%
B 13
 
7.5%
T 12
 
6.9%
W 12
 
6.9%
P 12
 
6.9%
D 11
 
6.4%
L 11
 
6.4%
A 10
 
5.8%
G 9
 
5.2%
R 9
 
5.2%
Other values (15) 60
34.7%
Decimal Number
ValueCountFrequency (%)
8 3
27.3%
4 2
18.2%
2 2
18.2%
1 1
 
9.1%
7 1
 
9.1%
3 1
 
9.1%
5 1
 
9.1%
Space Separator
ValueCountFrequency (%)
74
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 10
100.0%
Other Punctuation
ValueCountFrequency (%)
' 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 843
89.7%
Common 97
 
10.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 97
 
11.5%
e 63
 
7.5%
r 57
 
6.8%
i 56
 
6.6%
o 52
 
6.2%
n 47
 
5.6%
s 43
 
5.1%
l 41
 
4.9%
t 35
 
4.2%
u 29
 
3.4%
Other values (40) 323
38.3%
Common
ValueCountFrequency (%)
74
76.3%
- 10
 
10.3%
8 3
 
3.1%
4 2
 
2.1%
2 2
 
2.1%
' 2
 
2.1%
1 1
 
1.0%
7 1
 
1.0%
3 1
 
1.0%
5 1
 
1.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 936
99.6%
None 4
 
0.4%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 97
 
10.4%
74
 
7.9%
e 63
 
6.7%
r 57
 
6.1%
i 56
 
6.0%
o 52
 
5.6%
n 47
 
5.0%
s 43
 
4.6%
l 41
 
4.4%
t 35
 
3.7%
Other values (49) 371
39.6%
None
ValueCountFrequency (%)
é 4
100.0%

height
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 48
Distinct (%): 55.8%
Missing: 1
Missing (%): 1.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 173.61628
Minimum: 66
Maximum: 264
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 828.0 B

Quantile statistics

Minimum: 66
5-th percentile: 94.5
Q1: 167
median: 180
Q3: 191
95-th percentile: 222
Maximum: 264
Range: 198
Interquartile range (IQR): 24
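
The last two rows are derived from the others: the range is maximum minus minimum, and the IQR is Q3 minus Q1. A small sketch of that arithmetic using the `height` values from this block; the helper name `spread` is hypothetical.

```python
def spread(minimum: float, q1: float, q3: float, maximum: float) -> dict:
    """Derive range and interquartile range from already-computed quantiles."""
    return {"range": maximum - minimum, "iqr": q3 - q1}


# Quantiles reported for `height` in this section.
height = spread(minimum=66, q1=167, q3=191, maximum=264)
print(height)  # {'range': 198, 'iqr': 24}
```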

Descriptive statistics

Standard deviation: 36.141281
Coefficient of variation (CV): 0.20816758
Kurtosis: 2.1353603
Mean: 173.61628
Median Absolute Deviation (MAD): 12.5
Skewness: -1.1850659
Sum: 14931
Variance: 1306.1922
Monotonicity: Not monotonic
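
Several of these figures can be cross-checked against each other: the mean is the sum divided by the number of non-missing values (87 rows minus 1 missing), the coefficient of variation is the standard deviation divided by the mean, and the variance is the standard deviation squared. The sketch below redoes that arithmetic from the rounded values printed in this report, so small differences in the last digits are expected.

```python
# Figures reported for `height`; one of the 87 rows is missing.
n_non_missing = 87 - 1
total = 14931
std = 36.141281

mean = total / n_non_missing   # reported mean: 173.61628
cv = std / mean                # reported CV: 0.20816758
variance = std ** 2            # reported variance: 1306.1922

print(f"mean={mean:.5f} cv={cv:.6f} variance={variance:.4f}")
```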
Histogram with fixed size bins (bins=48)
Value               Count  Frequency (%)
183                     7   8.0%
188                     5   5.7%
170                     5   5.7%
196                     4   4.6%
178                     4   4.6%
180                     4   4.6%
191                     3   3.4%
175                     3   3.4%
165                     3   3.4%
163                     2   2.3%
Other values (38)      46  52.9%
ValueCountFrequency (%)
66 1
1.1%
67 1
1.1%
79 1
1.1%
80 1
1.1%
94 1
1.1%
96 2
2.3%
97 1
1.1%
112 1
1.1%
122 1
1.1%
137 1
1.1%
ValueCountFrequency (%)
264 1
1.1%
234 1
1.1%
229 1
1.1%
228 1
1.1%
224 1
1.1%
216 1
1.1%
213 1
1.1%
206 2
2.3%
202 1
1.1%
201 1
1.1%

mass
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 43
Distinct (%): 66.2%
Missing: 22
Missing (%): 25.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 94.353846
Minimum: 15
Maximum: 1358
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 828.0 B
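
The two percentages in this block use different denominators, which is easy to miss. Judging from the numbers, Missing (%) is relative to all 87 rows (22 / 87), while Distinct (%) is relative to the 65 non-missing values (43 / 65). The sketch below reproduces both; the helper name `percentages` is hypothetical and the denominators are inferred from the reported figures, not taken from the ydata-profiling source.

```python
def percentages(n_rows: int, n_missing: int, n_distinct: int) -> dict:
    """Missing share over all rows; distinct share over non-missing rows only."""
    n_present = n_rows - n_missing
    return {
        "missing_pct": 100 * n_missing / n_rows,
        "distinct_pct": 100 * n_distinct / n_present,
    }


# Counts reported for `mass`.
mass = percentages(n_rows=87, n_missing=22, n_distinct=43)
print({key: round(value, 1) for key, value in mass.items()})
```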

Quantile statistics

Minimum: 15
5-th percentile: 22.4
Q1: 55
median: 79
Q3: 84
95-th percentile: 136
Maximum: 1358
Range: 1343
Interquartile range (IQR): 29

Descriptive statistics

Standard deviation: 161.754
Coefficient of variation (CV): 1.714334
Kurtosis: 60.779027
Mean: 94.353846
Median Absolute Deviation (MAD): 11
Skewness: 7.6742474
Sum: 6133
Variance: 26164.357
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=43)
Value               Count  Frequency (%)
80                      7   8.0%
79                      4   4.6%
77                      3   3.4%
75                      3   3.4%
84                      3   3.4%
48                      2   2.3%
45                      2   2.3%
50                      2   2.3%
55                      2   2.3%
82                      2   2.3%
Other values (33)      35  40.2%
(Missing)              22  25.3%
ValueCountFrequency (%)
15 1
1.1%
17 1
1.1%
18 1
1.1%
20 1
1.1%
32 2
2.3%
40 1
1.1%
45 2
2.3%
48 2
2.3%
49 1
1.1%
50 2
2.3%
ValueCountFrequency (%)
1358 1
1.1%
159 1
1.1%
140 1
1.1%
136 2
2.3%
120 1
1.1%
113 1
1.1%
112 1
1.1%
110 1
1.1%
102 1
1.1%
91 1
1.1%

hair_color
Categorical

MISSING 

Distinct: 12
Distinct (%): 14.6%
Missing: 5
Missing (%): 5.7%
Memory size: 5.2 KiB

Length

Max length: 11
Median length: 10
Mean length: 4.8780488
Min length: 3

Characters and Unicode

Total characters: 400
Distinct characters: 20
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 5
Unique (%): 6.1%

Sample

1st row: blond
2nd row: sandy-blond
3rd row: dark brown
4th row: brown
5th row: brown

Common Values

Value               Count  Frequency (%)
none                   36  41.4%
brown                  16  18.4%
black                  11  12.6%
blond                   4   4.6%
white                   4   4.6%
dark brown              3   3.4%
auburn                  3   3.4%
sandy-blond             1   1.1%
red                     1   1.1%
light brown             1   1.1%
Other values (2)        2   2.3%
(Missing)               5   5.7%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
none 36
41.9%
brown 20
23.3%
black 11
 
12.8%
blond 4
 
4.7%
white 4
 
4.7%
dark 3
 
3.5%
auburn 3
 
3.5%
sandy-blond 1
 
1.2%
red 1
 
1.2%
light 1
 
1.2%
Other values (2) 2
 
2.3%

Most occurring characters

ValueCountFrequency (%)
n 101
25.2%
o 62
15.5%
e 41
10.2%
b 39
 
9.8%
r 28
 
7.0%
w 24
 
6.0%
a 19
 
4.8%
l 18
 
4.5%
k 14
 
3.5%
d 11
 
2.8%
Other values (10) 43
10.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 395
98.8%
Space Separator 4
 
1.0%
Dash Punctuation 1
 
0.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 101
25.6%
o 62
15.7%
e 41
10.4%
b 39
 
9.9%
r 28
 
7.1%
w 24
 
6.1%
a 19
 
4.8%
l 18
 
4.6%
k 14
 
3.5%
d 11
 
2.8%
Other values (8) 38
 
9.6%
Space Separator
ValueCountFrequency (%)
4
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 395
98.8%
Common 5
 
1.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
n 101
25.6%
o 62
15.7%
e 41
10.4%
b 39
 
9.9%
r 28
 
7.1%
w 24
 
6.1%
a 19
 
4.8%
l 18
 
4.6%
k 14
 
3.5%
d 11
 
2.8%
Other values (8) 38
 
9.6%
Common
ValueCountFrequency (%)
4
80.0%
- 1
 
20.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 400
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n 101
25.2%
o 62
15.5%
e 41
10.2%
b 39
 
9.8%
r 28
 
7.0%
w 24
 
6.0%
a 19
 
4.8%
l 18
 
4.5%
k 14
 
3.5%
d 11
 
2.8%
Other values (10) 43
10.8%

skin_color
Categorical

HIGH CORRELATION  MISSING 

Distinct: 33
Distinct (%): 38.4%
Missing: 1
Missing (%): 1.1%
Memory size: 5.5 KiB

Length

Max length: 24
Median length: 19
Mean length: 6.3139535
Min length: 3

Characters and Unicode

Total characters: 543
Distinct characters: 24
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 21
Unique (%): 24.4%

Sample

1st row: light
2nd row: gold
3rd row: blue, silver, white
4th row: pale
5th row: light

Common Values

Value               Count  Frequency (%)
light                  19  21.8%
fair                   10  11.5%
green                   8   9.2%
tan                     6   6.9%
blue                    4   4.6%
dark                    4   4.6%
white                   3   3.4%
pale                    3   3.4%
orange                  2   2.3%
yellow                  2   2.3%
Other values (23)      25  28.7%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
light 20
18.7%
white 11
10.3%
fair 10
9.3%
green 9
 
8.4%
blue 7
 
6.5%
tan 6
 
5.6%
brown 6
 
5.6%
dark 5
 
4.7%
red 5
 
4.7%
orange 4
 
3.7%
Other values (14) 24
22.4%

Most occurring characters

ValueCountFrequency (%)
e 65
12.0%
l 48
 
8.8%
r 48
 
8.8%
t 45
 
8.3%
i 45
 
8.3%
g 40
 
7.4%
a 36
 
6.6%
h 31
 
5.7%
n 27
 
5.0%
w 21
 
3.9%
Other values (14) 137
25.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 505
93.0%
Space Separator 21
 
3.9%
Other Punctuation 15
 
2.8%
Dash Punctuation 2
 
0.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 65
12.9%
l 48
9.5%
r 48
9.5%
t 45
8.9%
i 45
8.9%
g 40
7.9%
a 36
 
7.1%
h 31
 
6.1%
n 27
 
5.3%
w 21
 
4.2%
Other values (11) 99
19.6%
Space Separator
ValueCountFrequency (%)
21
100.0%
Other Punctuation
ValueCountFrequency (%)
, 15
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 505
93.0%
Common 38
 
7.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 65
12.9%
l 48
9.5%
r 48
9.5%
t 45
8.9%
i 45
8.9%
g 40
7.9%
a 36
 
7.1%
h 31
 
6.1%
n 27
 
5.3%
w 21
 
4.2%
Other values (11) 99
19.6%
Common
ValueCountFrequency (%)
21
55.3%
, 15
39.5%
- 2
 
5.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 543
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 65
12.0%
l 48
 
8.8%
r 48
 
8.8%
t 45
 
8.3%
i 45
 
8.3%
g 40
 
7.4%
a 36
 
6.6%
h 31
 
5.7%
n 27
 
5.0%
w 21
 
3.9%
Other values (14) 137
25.2%

eye_color
Categorical

Distinct: 16
Distinct (%): 18.4%
Missing: 0
Missing (%): 0.0%
Memory size: 5.4 KiB

Length

Max length: 14
Median length: 12
Mean length: 5.2988506
Min length: 3

Characters and Unicode

Total characters: 461
Distinct characters: 22
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 7
Unique (%): 8.0%

Sample

1st row: blue
2nd row: yellow
3rd row: red
4th row: yellow
5th row: brown

Common Values

Value               Count  Frequency (%)
blue                   19  21.8%
brown                  18  20.7%
black                  11  12.6%
orange                  9  10.3%
yellow                  8   9.2%
red                     7   8.0%
hazel                   4   4.6%
blue-gray               2   2.3%
gold                    2   2.3%
green-gold              1   1.1%
Other values (6)        6   6.9%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
blue 20
22.0%
brown 19
20.9%
black 12
13.2%
orange 9
9.9%
yellow 8
 
8.8%
red 8
 
8.8%
hazel 4
 
4.4%
blue-gray 2
 
2.2%
gold 2
 
2.2%
green-gold 1
 
1.1%
Other values (6) 6
 
6.6%

Most occurring characters

ValueCountFrequency (%)
l 60
13.0%
e 57
12.4%
b 55
11.9%
r 43
9.3%
o 41
8.9%
n 31
6.7%
w 30
 
6.5%
a 29
 
6.3%
u 23
 
5.0%
g 17
 
3.7%
Other values (12) 75
16.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 452
98.0%
Dash Punctuation 4
 
0.9%
Space Separator 4
 
0.9%
Other Punctuation 1
 
0.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
l 60
13.3%
e 57
12.6%
b 55
12.2%
r 43
9.5%
o 41
9.1%
n 31
6.9%
w 30
6.6%
a 29
6.4%
u 23
 
5.1%
g 17
 
3.8%
Other values (9) 66
14.6%
Dash Punctuation
ValueCountFrequency (%)
- 4
100.0%
Space Separator
ValueCountFrequency (%)
4
100.0%
Other Punctuation
ValueCountFrequency (%)
, 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 452
98.0%
Common 9
 
2.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
l 60
13.3%
e 57
12.6%
b 55
12.2%
r 43
9.5%
o 41
9.1%
n 31
6.9%
w 30
6.6%
a 29
6.4%
u 23
 
5.1%
g 17
 
3.8%
Other values (9) 66
14.6%
Common
ValueCountFrequency (%)
- 4
44.4%
4
44.4%
, 1
 
11.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 461
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
l 60
13.0%
e 57
12.4%
b 55
11.9%
r 43
9.3%
o 41
8.9%
n 31
6.7%
w 30
 
6.5%
a 29
 
6.3%
u 23
 
5.0%
g 17
 
3.7%
Other values (12) 75
16.3%

birth_year
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 38
Distinct (%): 76.0%
Missing: 37
Missing (%): 42.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 78.94
Minimum: 2
Maximum: 896
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 828.0 B

Quantile statistics

Minimum: 2
5-th percentile: 9.35
Q1: 29
median: 47.5
Q3: 70.75
95-th percentile: 160.4
Maximum: 896
Range: 894
Interquartile range (IQR): 41.75

Descriptive statistics

Standard deviation: 145.35704
Coefficient of variation (CV): 1.8413611
Kurtosis: 23.66993
Mean: 78.94
Median Absolute Deviation (MAD): 21.5
Skewness: 4.7374785
Sum: 3947
Variance: 21128.67
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=38)
Value               Count  Frequency (%)
19                      3   3.4%
82                      3   3.4%
41                      3   3.4%
48                      2   2.3%
72                      2   2.3%
15                      2   2.3%
31                      2   2.3%
29                      2   2.3%
52                      2   2.3%
67                      1   1.1%
Other values (28)      28  32.2%
(Missing)              37  42.5%
ValueCountFrequency (%)
2 1
 
1.1%
6 1
 
1.1%
8 1
 
1.1%
11 1
 
1.1%
15 2
2.3%
19 3
3.4%
21 1
 
1.1%
22 1
 
1.1%
24 1
 
1.1%
29 2
2.3%
ValueCountFrequency (%)
896 1
 
1.1%
600 1
 
1.1%
200 1
 
1.1%
112 1
 
1.1%
110 1
 
1.1%
102 1
 
1.1%
93 1
 
1.1%
92 1
 
1.1%
82 3
3.4%
72 2
2.3%

birth_era
Categorical

HIGH CORRELATION  IMBALANCE  MISSING 

Distinct: 2
Distinct (%): 4.0%
Missing: 37
Missing (%): 42.5%
Memory size: 4.2 KiB

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 150
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: BBY
2nd row: BBY
3rd row: BBY
4th row: BBY
5th row: BBY

Common Values

Value               Count  Frequency (%)
BBY                    45  51.7%
ABY                     5   5.7%
(Missing)              37  42.5%
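
The 53.1% imbalance figure reported for `birth_era` is consistent with an entropy-based score: one minus the Shannon entropy of the class shares divided by the maximum possible entropy for that number of classes. The sketch below is an illustration under that assumption, not a copy of the ydata-profiling implementation; `imbalance_score` is a hypothetical name, and returning 0 for fewer than two classes is a convention chosen here.

```python
import math


def imbalance_score(counts):
    """1 - entropy / max entropy: 0 means perfectly balanced, 1 means one class dominates fully."""
    counts = [c for c in counts if c > 0]
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0  # assumption: a single class is treated as "no imbalance to measure"
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts)
    return 1 - entropy / math.log2(n_classes)


birth_era = imbalance_score([45, 5])    # BBY, ABY counts from the table above
death_era = imbalance_score([42, 20])   # BBY, ABY counts reported for death_era
print(f"birth_era={birth_era:.1%} death_era={death_era:.1%}")
```

Under this definition `death_era` scores far lower than `birth_era`, which fits the fact that only `birth_era` carries an Imbalance alert in this report.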

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
bby 45
90.0%
aby 5
 
10.0%

Most occurring characters

ValueCountFrequency (%)
B 95
63.3%
Y 50
33.3%
A 5
 
3.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 150
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
B 95
63.3%
Y 50
33.3%
A 5
 
3.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 150
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
B 95
63.3%
Y 50
33.3%
A 5
 
3.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 150
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
B 95
63.3%
Y 50
33.3%
A 5
 
3.3%

birth_place
Text

MISSING 

Distinct: 30
Distinct (%): 81.1%
Missing: 50
Missing (%): 57.5%
Memory size: 4.0 KiB

Length

Max length: 18
Median length: 11
Mean length: 7.7297297
Min length: 4

Characters and Unicode

Total characters: 286
Distinct characters: 38
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 26
Unique (%): 70.3%

Sample

1st row: Polis Massa
2nd row: Affa
3rd row: Tatooine
4th row: Polis Massa
5th row: Ator
ValueCountFrequency (%)
tatooine 4
 
8.7%
naboo 3
 
6.5%
polis 2
 
4.3%
massa 2
 
4.3%
coruscant 2
 
4.3%
nal 1
 
2.2%
cala 1
 
2.2%
chandrila 1
 
2.2%
affa 1
 
2.2%
socorro 1
 
2.2%
Other values (28) 28
60.9%

Most occurring characters

ValueCountFrequency (%)
a 43
15.0%
o 36
12.6%
r 23
 
8.0%
n 22
 
7.7%
i 16
 
5.6%
s 14
 
4.9%
t 13
 
4.5%
e 12
 
4.2%
l 10
 
3.5%
9
 
3.1%
Other values (28) 88
30.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 232
81.1%
Uppercase Letter 44
 
15.4%
Space Separator 9
 
3.1%
Decimal Number 1
 
0.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 43
18.5%
o 36
15.5%
r 23
9.9%
n 22
9.5%
i 16
 
6.9%
s 14
 
6.0%
t 13
 
5.6%
e 12
 
5.2%
l 10
 
4.3%
u 7
 
3.0%
Other values (12) 36
15.5%
Uppercase Letter
ValueCountFrequency (%)
C 7
15.9%
T 5
11.4%
M 5
11.4%
S 4
9.1%
N 4
9.1%
H 3
6.8%
P 3
6.8%
A 3
6.8%
K 3
6.8%
D 2
 
4.5%
Other values (4) 5
11.4%
Space Separator
ValueCountFrequency (%)
9
100.0%
Decimal Number
ValueCountFrequency (%)
4 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 276
96.5%
Common 10
 
3.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 43
15.6%
o 36
13.0%
r 23
 
8.3%
n 22
 
8.0%
i 16
 
5.8%
s 14
 
5.1%
t 13
 
4.7%
e 12
 
4.3%
l 10
 
3.6%
u 7
 
2.5%
Other values (26) 80
29.0%
Common
ValueCountFrequency (%)
9
90.0%
4 1
 
10.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 286
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 43
15.0%
o 36
12.6%
r 23
 
8.0%
n 22
 
7.7%
i 16
 
5.6%
s 14
 
4.9%
t 13
 
4.5%
e 12
 
4.2%
l 10
 
3.5%
9
 
3.1%
Other values (28) 88
30.8%

death_year
Real number (ℝ)

MISSING  ZEROS 

Distinct: 19
Distinct (%): 30.6%
Missing: 25
Missing (%): 28.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 16.370968
Minimum: 0
Maximum: 45
Zeros: 10
Zeros (%): 11.5%
Negative: 0
Negative (%): 0.0%
Memory size: 828.0 B

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 4
median: 19
Q3: 22
95-th percentile: 34.95
Maximum: 45
Range: 45
Interquartile range (IQR): 18

Descriptive statistics

Standard deviation: 11.627037
Coefficient of variation (CV): 0.71022297
Kurtosis: -0.71644317
Mean: 16.370968
Median Absolute Deviation (MAD): 8
Skewness: 0.097364044
Sum: 1015
Variance: 135.188
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=19)
Value               Count  Frequency (%)
19                     17  19.5%
0                      10  11.5%
4                       5   5.7%
3                       4   4.6%
22                      4   4.6%
34                      3   3.4%
20                      3   3.4%
35                      3   3.4%
32                      2   2.3%
18                      2   2.3%
Other values (9)        9  10.3%
(Missing)              25  28.7%
ValueCountFrequency (%)
0 10
11.5%
3 4
 
4.6%
4 5
 
5.7%
9 1
 
1.1%
11 1
 
1.1%
14 1
 
1.1%
18 2
 
2.3%
19 17
19.5%
20 3
 
3.4%
21 1
 
1.1%
ValueCountFrequency (%)
45 1
 
1.1%
35 3
3.4%
34 3
3.4%
32 2
2.3%
29 1
 
1.1%
27 1
 
1.1%
25 1
 
1.1%
24 1
 
1.1%
22 4
4.6%
21 1
 
1.1%

death_era
Categorical

HIGH CORRELATION  MISSING 

Distinct: 2
Distinct (%): 3.2%
Missing: 25
Missing (%): 28.7%
Memory size: 4.5 KiB

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 186
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: ABY
2nd row: ABY
3rd row: BBY
4th row: ABY
5th row: ABY

Common Values

Value               Count  Frequency (%)
BBY                    42  48.3%
ABY                    20  23.0%
(Missing)              25  28.7%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
bby 42
67.7%
aby 20
32.3%

Most occurring characters

ValueCountFrequency (%)
B 104
55.9%
Y 62
33.3%
A 20
 
10.8%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 186
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
B 104
55.9%
Y 62
33.3%
A 20
 
10.8%

Most occurring scripts

ValueCountFrequency (%)
Latin 186
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
B 104
55.9%
Y 62
33.3%
A 20
 
10.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 186
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
B 104
55.9%
Y 62
33.3%
A 20
 
10.8%

death_place
Text

MISSING 

Distinct: 32
Distinct (%): 56.1%
Missing: 30
Missing (%): 34.5%
Memory size: 4.7 KiB

Length

Max length: 27
Median length: 14
Mean length: 9.0701754
Min length: 5

Characters and Unicode

Total characters: 517
Distinct characters: 44
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 23
Unique (%): 40.4%

Sample

1st row: Ahch-To
2nd row: Bespin
3rd row: Carida
4th row: Death Star II
5th row: Ajan Kloss
ValueCountFrequency (%)
coruscant 9
 
11.4%
tatooine 8
 
10.1%
system 4
 
5.1%
star 3
 
3.8%
death 3
 
3.8%
mustafar 3
 
3.8%
naboo 3
 
3.8%
felucia 3
 
3.8%
bespin 2
 
2.5%
tantive 2
 
2.5%
Other values (36) 39
49.4%

Most occurring characters

ValueCountFrequency (%)
a 67
13.0%
o 48
 
9.3%
t 42
 
8.1%
e 33
 
6.4%
n 32
 
6.2%
s 31
 
6.0%
i 31
 
6.0%
r 29
 
5.6%
22
 
4.3%
u 17
 
3.3%
Other values (34) 165
31.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 411
79.5%
Uppercase Letter 81
 
15.7%
Space Separator 22
 
4.3%
Dash Punctuation 2
 
0.4%
Decimal Number 1
 
0.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 67
16.3%
o 48
11.7%
t 42
10.2%
e 33
8.0%
n 32
7.8%
s 31
7.5%
i 31
7.5%
r 29
7.1%
u 17
 
4.1%
l 15
 
3.6%
Other values (13) 66
16.1%
Uppercase Letter
ValueCountFrequency (%)
C 13
16.0%
T 11
13.6%
S 8
9.9%
I 7
8.6%
M 6
 
7.4%
D 5
 
6.2%
B 5
 
6.2%
F 4
 
4.9%
N 4
 
4.9%
V 3
 
3.7%
Other values (8) 15
18.5%
Space Separator
ValueCountFrequency (%)
22
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 2
100.0%
Decimal Number
ValueCountFrequency (%)
1 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 492
95.2%
Common 25
 
4.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 67
13.6%
o 48
 
9.8%
t 42
 
8.5%
e 33
 
6.7%
n 32
 
6.5%
s 31
 
6.3%
i 31
 
6.3%
r 29
 
5.9%
u 17
 
3.5%
l 15
 
3.0%
Other values (31) 147
29.9%
Common
ValueCountFrequency (%)
22
88.0%
- 2
 
8.0%
1 1
 
4.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 517
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 67
13.0%
o 48
 
9.3%
t 42
 
8.1%
e 33
 
6.4%
n 32
 
6.2%
s 31
 
6.0%
i 31
 
6.0%
r 29
 
5.6%
22
 
4.3%
u 17
 
3.3%
Other values (34) 165
31.9%

sex
Categorical

HIGH CORRELATION 

Distinct: 4
Distinct (%): 4.6%
Missing: 0
Missing (%): 0.0%
Memory size: 5.4 KiB

Length

Max length: 14
Median length: 4
Mean length: 4.5287356
Min length: 4

Characters and Unicode

Total characters: 394
Distinct characters: 14
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 1.1%

Sample

1st row: male
2nd row: none
3rd row: none
4th row: male
5th row: female

Common Values

Value               Count  Frequency (%)
male                   62  71.3%
female                 18  20.7%
none                    6   6.9%
hermaphroditic          1   1.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
male 62
71.3%
female 18
 
20.7%
none 6
 
6.9%
hermaphroditic 1
 
1.1%

Most occurring characters

ValueCountFrequency (%)
e 105
26.6%
m 81
20.6%
a 81
20.6%
l 80
20.3%
f 18
 
4.6%
n 12
 
3.0%
o 7
 
1.8%
h 2
 
0.5%
r 2
 
0.5%
i 2
 
0.5%
Other values (4) 4
 
1.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 394
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 105
26.6%
m 81
20.6%
a 81
20.6%
l 80
20.3%
f 18
 
4.6%
n 12
 
3.0%
o 7
 
1.8%
h 2
 
0.5%
r 2
 
0.5%
i 2
 
0.5%
Other values (4) 4
 
1.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 394
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 105
26.6%
m 81
20.6%
a 81
20.6%
l 80
20.3%
f 18
 
4.6%
n 12
 
3.0%
o 7
 
1.8%
h 2
 
0.5%
r 2
 
0.5%
i 2
 
0.5%
Other values (4) 4
 
1.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 394
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 105
26.6%
m 81
20.6%
a 81
20.6%
l 80
20.3%
f 18
 
4.6%
n 12
 
3.0%
o 7
 
1.8%
h 2
 
0.5%
r 2
 
0.5%
i 2
 
0.5%
Other values (4) 4
 
1.0%

gender
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 2.3%
Missing: 0
Missing (%): 0.0%
Memory size: 5.7 KiB

Length

Max length: 9
Median length: 9
Mean length: 8.7816092
Min length: 8

Characters and Unicode

Total characters: 764
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: masculine
2nd row: masculine
3rd row: masculine
4th row: masculine
5th row: feminine

Common Values

Value               Count  Frequency (%)
masculine              68  78.2%
feminine               19  21.8%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
masculine 68
78.2%
feminine 19
 
21.8%

Most occurring characters

ValueCountFrequency (%)
i 106
13.9%
n 106
13.9%
e 106
13.9%
m 87
11.4%
a 68
8.9%
s 68
8.9%
c 68
8.9%
u 68
8.9%
l 68
8.9%
f 19
 
2.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 764
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 106
13.9%
n 106
13.9%
e 106
13.9%
m 87
11.4%
a 68
8.9%
s 68
8.9%
c 68
8.9%
u 68
8.9%
l 68
8.9%
f 19
 
2.5%

Most occurring scripts

ValueCountFrequency (%)
Latin 764
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 106
13.9%
n 106
13.9%
e 106
13.9%
m 87
11.4%
a 68
8.9%
s 68
8.9%
c 68
8.9%
u 68
8.9%
l 68
8.9%
f 19
 
2.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 764
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 106
13.9%
n 106
13.9%
e 106
13.9%
m 87
11.4%
a 68
8.9%
s 68
8.9%
c 68
8.9%
u 68
8.9%
l 68
8.9%
f 19
 
2.5%

pronoun
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 2.3%
Missing: 0
Missing (%): 0.0%
Memory size: 5.5 KiB

Length

Max length: 7
Median length: 6
Mean length: 6.2183908
Min length: 6

Characters and Unicode

Total characters: 541
Distinct characters: 7
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: he/him
2nd row: he/him
3rd row: he/him
4th row: he/him
5th row: she/her

Common Values

Value               Count  Frequency (%)
he/him                 68  78.2%
she/her                19  21.8%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
he/him 68
78.2%
she/her 19
 
21.8%

Most occurring characters

ValueCountFrequency (%)
h 174
32.2%
e 106
19.6%
/ 87
16.1%
i 68
 
12.6%
m 68
 
12.6%
s 19
 
3.5%
r 19
 
3.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 454
83.9%
Other Punctuation 87
 
16.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
h 174
38.3%
e 106
23.3%
i 68
 
15.0%
m 68
 
15.0%
s 19
 
4.2%
r 19
 
4.2%
Other Punctuation
ValueCountFrequency (%)
/ 87
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 454
83.9%
Common 87
 
16.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
h 174
38.3%
e 106
23.3%
i 68
 
15.0%
m 68
 
15.0%
s 19
 
4.2%
r 19
 
4.2%
Common
ValueCountFrequency (%)
/ 87
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 541
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
h 174
32.2%
e 106
19.6%
/ 87
16.1%
i 68
 
12.6%
m 68
 
12.6%
s 19
 
3.5%
r 19
 
3.5%

homeworld
Text

MISSING 

Distinct: 54
Distinct (%): 65.1%
Missing: 4
Missing (%): 4.6%
Memory size: 5.5 KiB

Length

Max length: 13
Median length: 12
Mean length: 7.1325301
Min length: 3

Characters and Unicode

Total characters: 592
Distinct characters: 46
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 45
Unique (%): 54.2%

Sample

1st row: Tatooine
2nd row: Tatooine
3rd row: Naboo
4th row: Tatooine
5th row: Alderaan
ValueCountFrequency (%)
naboo 11
 
12.0%
tatooine 10
 
10.9%
alderaan 3
 
3.3%
coruscant 3
 
3.3%
kamino 3
 
3.3%
mirial 2
 
2.2%
kashyyyk 2
 
2.2%
corellia 2
 
2.2%
ryloth 2
 
2.2%
nal 1
 
1.1%
Other values (53) 53
57.6%

Most occurring characters

ValueCountFrequency (%)
a 82
13.9%
o 77
13.0%
n 45
 
7.6%
i 41
 
6.9%
e 38
 
6.4%
r 34
 
5.7%
t 27
 
4.6%
l 27
 
4.6%
s 18
 
3.0%
u 16
 
2.7%
Other values (36) 187
31.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 490
82.8%
Uppercase Letter 92
 
15.5%
Space Separator 9
 
1.5%
Decimal Number 1
 
0.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 82
16.7%
o 77
15.7%
n 45
9.2%
i 41
8.4%
e 38
7.8%
r 34
 
6.9%
t 27
 
5.5%
l 27
 
5.5%
s 18
 
3.7%
u 16
 
3.3%
Other values (12) 85
17.3%
Uppercase Letter
ValueCountFrequency (%)
T 15
16.3%
N 14
15.2%
C 9
9.8%
K 7
 
7.6%
S 7
 
7.6%
A 5
 
5.4%
M 5
 
5.4%
H 4
 
4.3%
D 4
 
4.3%
R 3
 
3.3%
Other values (12) 19
20.7%
Space Separator
ValueCountFrequency (%)
9
100.0%
Decimal Number
ValueCountFrequency (%)
4 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 582
98.3%
Common 10
 
1.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 82
14.1%
o 77
13.2%
n 45
 
7.7%
i 41
 
7.0%
e 38
 
6.5%
r 34
 
5.8%
t 27
 
4.6%
l 27
 
4.6%
s 18
 
3.1%
u 16
 
2.7%
Other values (34) 177
30.4%
Common
ValueCountFrequency (%)
9
90.0%
4 1
 
10.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 592
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 82
13.9%
o 77
13.0%
n 45
 
7.6%
i 41
 
6.9%
e 38
 
6.4%
r 34
 
5.7%
t 27
 
4.6%
l 27
 
4.6%
s 18
 
3.0%
u 16
 
2.7%
Other values (36) 187
31.6%

species
Categorical

HIGH CORRELATION 

Distinct39
Distinct (%)44.8%
Missing0
Missing (%)0.0%
Memory size5.5 KiB
Human 38
Droid 6
Gungan 3
Mirialan 2
Wookiee 2
Other values (34) 36

Length

Max length16
Median length5
Mean length6.2528736
Min length3

Characters and Unicode

Total characters544
Distinct characters45
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique32 ?
Unique (%)36.8%

Sample

1st rowHuman
2nd rowDroid
3rd rowDroid
4th rowHuman
5th rowHuman

Common Values

ValueCountFrequency (%)
Human 38
43.7%
Droid 6
 
6.9%
Gungan 3
 
3.4%
Mirialan 2
 
2.3%
Wookiee 2
 
2.3%
Twi'lek 2
 
2.3%
Kaminoan 2
 
2.3%
Neimodian 1
 
1.1%
Hutt 1
 
1.1%
Yoda's species 1
 
1.1%
Other values (29) 29
33.3%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
human 38
41.8%
droid 6
 
6.6%
gungan 3
 
3.3%
mirialan 2
 
2.2%
wookiee 2
 
2.2%
twi'lek 2
 
2.2%
kaminoan 2
 
2.2%
zabrak 2
 
2.2%
besalisk 1
 
1.1%
tholothian 1
 
1.1%
Other values (32) 32
35.2%

Most occurring characters

ValueCountFrequency (%)
a 84
15.4%
n 73
13.4%
u 52
9.6%
m 44
 
8.1%
H 39
 
7.2%
o 32
 
5.9%
i 31
 
5.7%
e 24
 
4.4%
r 21
 
3.9%
l 15
 
2.8%
Other values (35) 129
23.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 446
82.0%
Uppercase Letter 90
 
16.5%
Space Separator 4
 
0.7%
Other Punctuation 4
 
0.7%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
H 39
43.3%
D 8
 
8.9%
T 7
 
7.8%
M 4
 
4.4%
C 4
 
4.4%
K 4
 
4.4%
G 4
 
4.4%
N 2
 
2.2%
S 2
 
2.2%
I 2
 
2.2%
Other values (12) 14
 
15.6%
Lowercase Letter
ValueCountFrequency (%)
a 84
18.8%
n 73
16.4%
u 52
11.7%
m 44
9.9%
o 32
 
7.2%
i 31
 
7.0%
e 24
 
5.4%
r 21
 
4.7%
l 15
 
3.4%
d 13
 
2.9%
Other values (11) 57
12.8%
Space Separator
ValueCountFrequency (%)
4
100.0%
Other Punctuation
ValueCountFrequency (%)
' 4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 536
98.5%
Common 8
 
1.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 84
15.7%
n 73
13.6%
u 52
9.7%
m 44
 
8.2%
H 39
 
7.3%
o 32
 
6.0%
i 31
 
5.8%
e 24
 
4.5%
r 21
 
3.9%
l 15
 
2.8%
Other values (33) 121
22.6%
Common
ValueCountFrequency (%)
4
50.0%
' 4
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 544
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 84
15.4%
n 73
13.4%
u 52
9.6%
m 44
 
8.1%
H 39
 
7.2%
o 32
 
5.9%
i 31
 
5.7%
e 24
 
4.4%
r 21
 
3.9%
l 15
 
2.8%
Other values (35) 129
23.7%

occupation
Text

Distinct51
Distinct (%)58.6%
Missing0
Missing (%)0.0%
Memory size6.5 KiB

Length

Max length42
Median length29
Mean length16.977011
Min length4

Characters and Unicode

Total characters1477
Distinct characters51
Distinct categories6 ?
Distinct scripts2 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique40 ?
Unique (%)46.0%

Sample

1st rowJedi Master
2nd rowProtocol droid
3rd rowAstromech droid
4th rowDark Lord of the Sith
5th rowPrincess of Alderaan
ValueCountFrequency (%)
jedi 20
 
8.5%
of 19
 
8.1%
the 15
 
6.4%
master 12
 
5.1%
pilot 9
 
3.8%
hunter 6
 
2.5%
droid 6
 
2.5%
general 5
 
2.1%
alliance 5
 
2.1%
podracer 5
 
2.1%
Other values (90) 134
56.8%

Most occurring characters

ValueCountFrequency (%)
e 164
 
11.1%
151
 
10.2%
r 115
 
7.8%
i 110
 
7.4%
t 100
 
6.8%
a 97
 
6.6%
o 96
 
6.5%
n 76
 
5.1%
d 64
 
4.3%
l 55
 
3.7%
Other values (41) 449
30.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1154
78.1%
Uppercase Letter 166
 
11.2%
Space Separator 151
 
10.2%
Dash Punctuation 2
 
0.1%
Other Punctuation 2
 
0.1%
Decimal Number 2
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 164
14.2%
r 115
10.0%
i 110
9.5%
t 100
8.7%
a 97
8.4%
o 96
8.3%
n 76
 
6.6%
d 64
 
5.5%
l 55
 
4.8%
s 43
 
3.7%
Other values (14) 234
20.3%
Uppercase Letter
ValueCountFrequency (%)
J 22
13.3%
C 21
12.7%
M 18
10.8%
A 17
10.2%
P 16
9.6%
G 11
 
6.6%
S 11
 
6.6%
B 8
 
4.8%
R 7
 
4.2%
H 6
 
3.6%
Other values (12) 29
17.5%
Decimal Number
ValueCountFrequency (%)
9 1
50.0%
0 1
50.0%
Space Separator
ValueCountFrequency (%)
151
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 2
100.0%
Other Punctuation
ValueCountFrequency (%)
' 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1320
89.4%
Common 157
 
10.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 164
12.4%
r 115
 
8.7%
i 110
 
8.3%
t 100
 
7.6%
a 97
 
7.3%
o 96
 
7.3%
n 76
 
5.8%
d 64
 
4.8%
l 55
 
4.2%
s 43
 
3.3%
Other values (36) 400
30.3%
Common
ValueCountFrequency (%)
151
96.2%
- 2
 
1.3%
' 2
 
1.3%
9 1
 
0.6%
0 1
 
0.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1476
99.9%
None 1
 
0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 164
 
11.1%
151
 
10.2%
r 115
 
7.8%
i 110
 
7.5%
t 100
 
6.8%
a 97
 
6.6%
o 96
 
6.5%
n 76
 
5.1%
d 64
 
4.3%
l 55
 
3.7%
Other values (40) 448
30.4%
None
ValueCountFrequency (%)
é 1
100.0%

cybernetics
Text

MISSING 

Distinct7
Distinct (%)100.0%
Missing80
Missing (%)92.0%
Memory size3.3 KiB

Length

Max length45
Median length38
Mean length30
Min length20

Characters and Unicode

Total characters210
Distinct characters34
Distinct categories7 ?
Distinct scripts2 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
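
As an illustration of these character properties, the sketch below looks up the Unicode general category of each character in one of the cybernetics sample values. It uses Python's standard `unicodedata` module, which is not necessarily what the report generator uses internally. The two-letter codes correspond to the category names shown in this report: `Lu` is Uppercase Letter, `Ll` is Lowercase Letter, `Zs` is Space Separator, `Nd` is Decimal Number and `Sk` is Modifier Symbol.

```python
import unicodedata
from collections import Counter

# One of the sample values shown for `cybernetics` (typed with plain spaces).
text = "AJ^6 cyborg construct"

# Two-letter general category for every character (e.g. "Lu", "Ll", "Zs").
categories = Counter(unicodedata.category(ch) for ch in text)

for code, n in categories.most_common():
    print(code, n)
```

The caret in `AJ^6` is what produces the single Modifier Symbol entry in the category table for this variable.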

Unique

Unique7 ?
Unique (%)100.0%

Sample

1st rowProsthetic right hand
2nd rowProsthetic arms and legs, life-support system
3rd rowCybernetic right arm
4th rowAJ^6 cyborg construct
5th rowSix-legged apparatus, Two cybernetic legs
ValueCountFrequency (%)
prosthetic 2
 
7.4%
legs 2
 
7.4%
cybernetic 2
 
7.4%
right 2
 
7.4%
six-legged 1
 
3.7%
for 1
 
3.7%
except 1
 
3.7%
cebernetic 1
 
3.7%
completely 1
 
3.7%
annunciator 1
 
3.7%
Other values (13) 13
48.1%

Most occurring characters

ValueCountFrequency (%)
e 19
 
9.0%
t 18
 
8.6%
18
 
8.6%
r 17
 
8.1%
c 13
 
6.2%
a 11
 
5.2%
o 11
 
5.2%
i 11
 
5.2%
s 10
 
4.8%
n 10
 
4.8%
Other values (24) 72
34.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 173
82.4%
Space Separator 20
 
9.5%
Uppercase Letter 10
 
4.8%
Other Punctuation 3
 
1.4%
Dash Punctuation 2
 
1.0%
Modifier Symbol 1
 
0.5%
Decimal Number 1
 
0.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 19
11.0%
t 18
 
10.4%
r 17
 
9.8%
c 13
 
7.5%
a 11
 
6.4%
o 11
 
6.4%
i 11
 
6.4%
s 10
 
5.8%
n 10
 
5.8%
l 7
 
4.0%
Other values (11) 46
26.6%
Uppercase Letter
ValueCountFrequency (%)
A 2
20.0%
C 2
20.0%
P 2
20.0%
J 1
10.0%
S 1
10.0%
T 1
10.0%
V 1
10.0%
Space Separator
ValueCountFrequency (%)
18
90.0%
  2
 
10.0%
Other Punctuation
ValueCountFrequency (%)
, 3
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 2
100.0%
Modifier Symbol
ValueCountFrequency (%)
^ 1
100.0%
Decimal Number
ValueCountFrequency (%)
6 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 183
87.1%
Common 27
 
12.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 19
 
10.4%
t 18
 
9.8%
r 17
 
9.3%
c 13
 
7.1%
a 11
 
6.0%
o 11
 
6.0%
i 11
 
6.0%
s 10
 
5.5%
n 10
 
5.5%
l 7
 
3.8%
Other values (18) 56
30.6%
Common
ValueCountFrequency (%)
18
66.7%
, 3
 
11.1%
- 2
 
7.4%
  2
 
7.4%
^ 1
 
3.7%
6 1
 
3.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 208
99.0%
None 2
 
1.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 19
 
9.1%
t 18
 
8.7%
18
 
8.7%
r 17
 
8.2%
c 13
 
6.2%
a 11
 
5.3%
o 11
 
5.3%
i 11
 
5.3%
s 10
 
4.8%
n 10
 
4.8%
Other values (23) 70
33.7%
None
ValueCountFrequency (%)
  2
100.0%

abilities
Categorical

HIGH CORRELATION  MISSING 

Distinct25
Distinct (%)45.5%
Missing32
Missing (%)36.8%
Memory size5.6 KiB
Lightsaber abilities, Force powers 11
Piloting 10
Piloting, Racing 5
Politics 4
Lightsaber training, Force powers, Other abilities 3
Other values (20) 22

Length

Max length53
Median length50
Mean length25.509091
Min length8

Characters and Unicode

Total characters1403
Distinct characters38
Distinct categories5 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique18 ?
Unique (%)32.7%

Sample

1st rowLightsaber abilities, Force powers, Other abilities
2nd rowLanguage known, Other skills
3rd rowLightsaber abilities, Force powers, Language known
4th rowJedi training, Force powers, Other abilities
5th rowBlaster abilities

Common Values

ValueCountFrequency (%)
Lightsaber abilities, Force powers 11
 
12.6%
Piloting 10
 
11.5%
Piloting, Racing 5
 
5.7%
Politics 4
 
4.6%
Lightsaber training, Force powers, Other abilities 3
 
3.4%
Lightsaber training, Force powers 2
 
2.3%
Lightsaber abilities, Force powers, Force lightning 2
 
2.3%
Force sensitivy 1
 
1.1%
Lightsaber abilities, Force powers, Language known 1
 
1.1%
Jedi training, Force powers, Other abilities 1
 
1.1%
Other values (15) 15
17.2%
(Missing) 32
36.8%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
force 29
17.3%
abilities 25
14.9%
lightsaber 24
14.3%
powers 22
13.1%
piloting 17
10.1%
training 9
 
5.4%
other 8
 
4.8%
racing 5
 
3.0%
politics 5
 
3.0%
lightning 3
 
1.8%
Other values (18) 21
12.5%

Most occurring characters

ValueCountFrequency (%)
i 185
13.2%
e 121
 
8.6%
113
 
8.1%
t 102
 
7.3%
r 100
 
7.1%
s 87
 
6.2%
o 79
 
5.6%
a 78
 
5.6%
g 68
 
4.8%
n 61
 
4.3%
Other values (28) 409
29.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1142
81.4%
Space Separator 113
 
8.1%
Uppercase Letter 101
 
7.2%
Other Punctuation 45
 
3.2%
Dash Punctuation 2
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 185
16.2%
e 121
10.6%
t 102
8.9%
r 100
8.8%
s 87
7.6%
o 79
 
6.9%
a 78
 
6.8%
g 68
 
6.0%
n 61
 
5.3%
l 58
 
5.1%
Other values (12) 203
17.8%
Uppercase Letter
ValueCountFrequency (%)
F 29
28.7%
L 26
25.7%
P 22
21.8%
O 8
 
7.9%
R 5
 
5.0%
S 3
 
3.0%
B 2
 
2.0%
T 1
 
1.0%
K 1
 
1.0%
D 1
 
1.0%
Other values (3) 3
 
3.0%
Space Separator
ValueCountFrequency (%)
113
100.0%
Other Punctuation
ValueCountFrequency (%)
, 45
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1243
88.6%
Common 160
 
11.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 185
14.9%
e 121
 
9.7%
t 102
 
8.2%
r 100
 
8.0%
s 87
 
7.0%
o 79
 
6.4%
a 78
 
6.3%
g 68
 
5.5%
n 61
 
4.9%
l 58
 
4.7%
Other values (25) 304
24.5%
Common
ValueCountFrequency (%)
113
70.6%
, 45
 
28.1%
- 2
 
1.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1403
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 185
13.2%
e 121
 
8.6%
113
 
8.1%
t 102
 
7.3%
r 100
 
7.1%
s 87
 
6.2%
o 79
 
5.6%
a 78
 
5.6%
g 68
 
4.8%
n 61
 
4.3%
Other values (28) 409
29.2%

equipment
Text

MISSING 

Distinct31
Distinct (%)50.0%
Missing25
Missing (%)28.7%
Memory size6.0 KiB

Length

Max length180
Median length124.5
Mean length27.370968
Min length7

Characters and Unicode

Total characters1697
Distinct characters59
Distinct categories6 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique26 ?
Unique (%)41.9%

Sample

1st rowLightsabers, Blasters
2nd rowBuzz saw, Electric pike, Drinks tray, Fusion welder, Scomp link, Power recharge coupler, Rocket boosters, Holographic projector, Motorized all-terrain treads, Retractable third leg
3rd rowClothing, Lightsabers
4th rowDefender sporting blaster pistol, X-30 Lancer target blast pistol, Lightsaber
5th rowBlasters, SX-14 Field Hover-Ute, GX-8 Moisture Vaporators, Droids
ValueCountFrequency (%)
lightsabers 24
 
11.8%
clothing 13
 
6.4%
blasters 8
 
3.9%
weapons 7
 
3.4%
armor 4
 
2.0%
helmet 4
 
2.0%
flight 3
 
1.5%
belt 2
 
1.0%
utility 2
 
1.0%
tools 2
 
1.0%
Other values (118) 135
66.2%

Most occurring characters

ValueCountFrequency (%)
e 144
 
8.5%
142
 
8.4%
s 127
 
7.5%
r 123
 
7.2%
t 110
 
6.5%
o 107
 
6.3%
a 99
 
5.8%
i 98
 
5.8%
l 82
 
4.8%
n 71
 
4.2%
Other values (49) 594
35.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1312
77.3%
Uppercase Letter 150
 
8.8%
Space Separator 142
 
8.4%
Other Punctuation 71
 
4.2%
Decimal Number 12
 
0.7%
Dash Punctuation 10
 
0.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 144
11.0%
s 127
9.7%
r 123
9.4%
t 110
 
8.4%
o 107
 
8.2%
a 99
 
7.5%
i 98
 
7.5%
l 82
 
6.2%
n 71
 
5.4%
h 65
 
5.0%
Other values (15) 286
21.8%
Uppercase Letter
ValueCountFrequency (%)
L 29
19.3%
C 17
11.3%
H 11
 
7.3%
B 10
 
6.7%
A 9
 
6.0%
W 9
 
6.0%
R 8
 
5.3%
M 7
 
4.7%
G 7
 
4.7%
F 6
 
4.0%
Other values (12) 37
24.7%
Decimal Number
ValueCountFrequency (%)
4 3
25.0%
8 2
16.7%
3 2
16.7%
0 2
16.7%
9 1
 
8.3%
7 1
 
8.3%
1 1
 
8.3%
Other Punctuation
ValueCountFrequency (%)
, 68
95.8%
/ 2
 
2.8%
. 1
 
1.4%
Space Separator
ValueCountFrequency (%)
142
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 10
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1462
86.2%
Common 235
 
13.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 144
 
9.8%
s 127
 
8.7%
r 123
 
8.4%
t 110
 
7.5%
o 107
 
7.3%
a 99
 
6.8%
i 98
 
6.7%
l 82
 
5.6%
n 71
 
4.9%
h 65
 
4.4%
Other values (37) 436
29.8%
Common
ValueCountFrequency (%)
142
60.4%
, 68
28.9%
- 10
 
4.3%
4 3
 
1.3%
8 2
 
0.9%
3 2
 
0.9%
0 2
 
0.9%
/ 2
 
0.9%
9 1
 
0.4%
. 1
 
0.4%
Other values (2) 2
 
0.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1697
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 144
 
8.5%
142
 
8.4%
s 127
 
7.5%
r 123
 
7.2%
t 110
 
6.5%
o 107
 
6.3%
a 99
 
5.8%
i 98
 
5.8%
l 82
 
4.8%
n 71
 
4.2%
Other values (49) 594
35.0%

films
Categorical

HIGH CORRELATION 

Distinct24
Distinct (%)27.6%
Missing0
Missing (%)0.0%
Memory size8.2 KiB
Attack of the Clones 13
The Phantom Menace 13
The Phantom Menace, Attack of the Clones, Revenge of the Sith 8
Attack of the Clones, Revenge of the Sith 7
The Force Awakens 5
Other values (19) 41

Length

Max length137
Median length95
Mean length38.218391
Min length10

Characters and Unicode

Total characters3325
Distinct characters35
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique8 ?
Unique (%)9.2%

Sample

1st rowA New Hope, The Empire Strikes Back, Return of the Jedi, Revenge of the Sith, The Force Awakens
2nd rowA New Hope, The Empire Strikes Back, Return of the Jedi, The Phantom Menace, Attack of the Clones, Revenge of the Sith
3rd rowA New Hope, The Empire Strikes Back, Return of the Jedi, The Phantom Menace, Attack of the Clones, Revenge of the Sith, The Force Awakens
4th rowA New Hope, The Empire Strikes Back, Return of the Jedi, Revenge of the Sith
5th rowA New Hope, The Empire Strikes Back, Return of the Jedi, Revenge of the Sith, The Force Awakens

Common Values

ValueCountFrequency (%)
Attack of the Clones 13
14.9%
The Phantom Menace 13
14.9%
The Phantom Menace, Attack of the Clones, Revenge of the Sith 8
 
9.2%
Attack of the Clones, Revenge of the Sith 7
 
8.0%
The Force Awakens 5
 
5.7%
Return of the Jedi 5
 
5.7%
A New Hope 4
 
4.6%
The Phantom Menace, Attack of the Clones 4
 
4.6%
The Empire Strikes Back 3
 
3.4%
Revenge of the Sith 3
 
3.4%
Other values (14) 22
25.3%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
the 155
24.6%
of 94
14.9%
attack 40
 
6.4%
clones 40
 
6.4%
phantom 34
 
5.4%
menace 34
 
5.4%
revenge 34
 
5.4%
sith 34
 
5.4%
return 20
 
3.2%
jedi 20
 
3.2%
Other values (8) 124
19.7%

Most occurring characters

ValueCountFrequency (%)
542
16.3%
e 495
14.9%
t 278
 
8.4%
h 223
 
6.7%
o 197
 
5.9%
n 173
 
5.2%
a 135
 
4.1%
c 101
 
3.0%
f 94
 
2.8%
i 86
 
2.6%
Other values (25) 1001
30.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2256
67.8%
Space Separator 542
 
16.3%
Uppercase Letter 441
 
13.3%
Other Punctuation 86
 
2.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 495
21.9%
t 278
12.3%
h 223
9.9%
o 197
 
8.7%
n 173
 
7.7%
a 135
 
6.0%
c 101
 
4.5%
f 94
 
4.2%
i 86
 
3.8%
k 83
 
3.7%
Other values (10) 391
17.3%
Uppercase Letter
ValueCountFrequency (%)
A 69
15.6%
T 61
13.8%
R 54
12.2%
S 50
11.3%
C 40
9.1%
M 34
7.7%
P 34
7.7%
J 20
 
4.5%
N 18
 
4.1%
H 18
 
4.1%
Other values (3) 43
9.8%
Space Separator
ValueCountFrequency (%)
542
100.0%
Other Punctuation
ValueCountFrequency (%)
, 86
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2697
81.1%
Common 628
 
18.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 495
18.4%
t 278
 
10.3%
h 223
 
8.3%
o 197
 
7.3%
n 173
 
6.4%
a 135
 
5.0%
c 101
 
3.7%
f 94
 
3.5%
i 86
 
3.2%
k 83
 
3.1%
Other values (23) 832
30.8%
Common
ValueCountFrequency (%)
542
86.3%
, 86
 
13.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3325
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
542
16.3%
e 495
14.9%
t 278
 
8.4%
h 223
 
6.7%
o 197
 
5.9%
n 173
 
5.2%
a 135
 
4.1%
c 101
 
3.0%
f 94
 
2.8%
i 86
 
2.6%
Other values (25) 1001
30.1%

vehicles
Text

MISSING 

Distinct14
Distinct (%)93.3%
Missing72
Missing (%)82.8%
Memory size3.5 KiB

Length

Max length49
Median length26
Mean length20.8
Min length5

Characters and Unicode

Total characters312
Distinct characters48
Distinct categories6 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique13 ?
Unique (%)86.7%

Sample

1st rowSnowspeeder, Imperial Speeder Bike
2nd rowImperial Speeder Bike
3rd rowZephyr-G swoop bike, V-35 Courier, T-16 Skyhopper
4th rowTribubble bongo
5th rowZephyr-G swoop bike, XJ-6 airspeeder
ValueCountFrequency (%)
bike 5
 
11.6%
speeder 4
 
9.3%
tribubble 2
 
4.7%
snowspeeder 2
 
4.7%
imperial 2
 
4.7%
zephyr-g 2
 
4.7%
swoop 2
 
4.7%
bongo 2
 
4.7%
airspeeder 2
 
4.7%
flitknot 1
 
2.3%
Other values (19) 19
44.2%
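
The word table above appears to count lowercased, whitespace-separated tokens with commas stripped and hyphens kept (for example `zephyr-g`). The sketch below applies that assumed rule to the five sample values only, so its counts are lower than the report's, which cover all 15 non-missing rows. The `tokenize` helper is hypothetical and not part of the report generator.

```python
from collections import Counter

# The five sample values shown for `vehicles`.
samples = [
    "Snowspeeder, Imperial Speeder Bike",
    "Imperial Speeder Bike",
    "Zephyr-G swoop bike, V-35 Courier, T-16 Skyhopper",
    "Tribubble bongo",
    "Zephyr-G swoop bike, XJ-6 airspeeder",
]

def tokenize(text):
    # Assumed rule: lowercase, split on whitespace, strip surrounding commas.
    tokens = (tok.strip(",") for tok in text.lower().split())
    return [tok for tok in tokens if tok]

word_counts = Counter(tok for s in samples for tok in tokenize(s))
total_words = sum(word_counts.values())

for word, n in word_counts.most_common(5):
    print(f"{word} {n} ({n / total_words:.1%})")
```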

Most occurring characters

ValueCountFrequency (%)
e 47
15.1%
28
 
9.0%
r 27
 
8.7%
o 19
 
6.1%
p 18
 
5.8%
i 16
 
5.1%
d 13
 
4.2%
b 11
 
3.5%
s 10
 
3.2%
a 9
 
2.9%
Other values (38) 114
36.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 224
71.8%
Uppercase Letter 36
 
11.5%
Space Separator 28
 
9.0%
Decimal Number 12
 
3.8%
Dash Punctuation 8
 
2.6%
Other Punctuation 4
 
1.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 47
21.0%
r 27
12.1%
o 19
8.5%
p 18
 
8.0%
i 16
 
7.1%
d 13
 
5.8%
b 11
 
4.9%
s 10
 
4.5%
a 9
 
4.0%
l 8
 
3.6%
Other values (12) 46
20.5%
Uppercase Letter
ValueCountFrequency (%)
T 7
19.4%
S 7
19.4%
B 3
8.3%
Z 2
 
5.6%
I 2
 
5.6%
P 2
 
5.6%
V 2
 
5.6%
G 2
 
5.6%
X 1
 
2.8%
J 1
 
2.8%
Other values (7) 7
19.4%
Decimal Number
ValueCountFrequency (%)
6 3
25.0%
3 3
25.0%
2 2
16.7%
1 2
16.7%
5 1
 
8.3%
7 1
 
8.3%
Space Separator
ValueCountFrequency (%)
28
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 8
100.0%
Other Punctuation
ValueCountFrequency (%)
, 4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 260
83.3%
Common 52
 
16.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 47
18.1%
r 27
 
10.4%
o 19
 
7.3%
p 18
 
6.9%
i 16
 
6.2%
d 13
 
5.0%
b 11
 
4.2%
s 10
 
3.8%
a 9
 
3.5%
l 8
 
3.1%
Other values (29) 82
31.5%
Common
ValueCountFrequency (%)
28
53.8%
- 8
 
15.4%
, 4
 
7.7%
6 3
 
5.8%
3 3
 
5.8%
2 2
 
3.8%
1 2
 
3.8%
5 1
 
1.9%
7 1
 
1.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 312
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 47
15.1%
28
 
9.0%
r 27
 
8.7%
o 19
 
6.1%
p 18
 
5.8%
i 16
 
5.1%
d 13
 
4.2%
b 11
 
3.5%
s 10
 
3.2%
a 9
 
2.9%
Other values (38) 114
36.5%

starships
Text

MISSING 

Distinct15
Distinct (%)75.0%
Missing67
Missing (%)77.0%
Memory size3.8 KiB

Length

Max length104
Median length35
Mean length23.7
Min length6

Characters and Unicode

Total characters474
Distinct characters41
Distinct categories6 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique12 ?
Unique (%)60.0%

Sample

1st rowX-wing, Imperial shuttle
2nd rowTIE Advanced x1
3rd rowX-wing
4th rowJedi starfighter, Trade Federation cruiser, Naboo star skiff, Jedi Interceptor, Belbullab-22 starfighter
5th rowNaboo fighter, Trade Federation cruiser, Jedi Interceptor
ValueCountFrequency (%)
naboo 6
 
9.7%
x-wing 5
 
8.1%
falcon 4
 
6.5%
jedi 4
 
6.5%
starfighter 4
 
6.5%
millennium 4
 
6.5%
imperial 3
 
4.8%
shuttle 3
 
4.8%
fighter 3
 
4.8%
belbullab-22 2
 
3.2%
Other values (18) 24
38.7%

Most occurring characters

ValueCountFrequency (%)
42
 
8.9%
i 38
 
8.0%
e 38
 
8.0%
a 32
 
6.8%
r 30
 
6.3%
t 29
 
6.1%
l 26
 
5.5%
n 24
 
5.1%
o 21
 
4.4%
s 14
 
3.0%
Other values (31) 180
38.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 361
76.2%
Uppercase Letter 45
 
9.5%
Space Separator 42
 
8.9%
Other Punctuation 11
 
2.3%
Dash Punctuation 9
 
1.9%
Decimal Number 6
 
1.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 38
 
10.5%
e 38
 
10.5%
a 32
 
8.9%
r 30
 
8.3%
t 29
 
8.0%
l 26
 
7.2%
n 24
 
6.6%
o 21
 
5.8%
s 14
 
3.9%
g 13
 
3.6%
Other values (13) 96
26.6%
Uppercase Letter
ValueCountFrequency (%)
N 7
15.6%
I 6
13.3%
F 6
13.3%
X 5
11.1%
J 4
8.9%
M 4
8.9%
T 3
6.7%
S 3
6.7%
A 2
 
4.4%
B 2
 
4.4%
Other values (3) 3
6.7%
Decimal Number
ValueCountFrequency (%)
2 4
66.7%
1 2
33.3%
Space Separator
ValueCountFrequency (%)
42
100.0%
Other Punctuation
ValueCountFrequency (%)
, 11
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 9
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 406
85.7%
Common 68
 
14.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 38
 
9.4%
e 38
 
9.4%
a 32
 
7.9%
r 30
 
7.4%
t 29
 
7.1%
l 26
 
6.4%
n 24
 
5.9%
o 21
 
5.2%
s 14
 
3.4%
g 13
 
3.2%
Other values (26) 141
34.7%
Common
ValueCountFrequency (%)
42
61.8%
, 11
 
16.2%
- 9
 
13.2%
2 4
 
5.9%
1 2
 
2.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 474
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
42
 
8.9%
i 38
 
8.0%
e 38
 
8.0%
a 32
 
6.8%
r 30
 
6.3%
t 29
 
6.1%
l 26
 
5.5%
n 24
 
5.1%
o 21
 
4.4%
s 14
 
3.0%
Other values (31) 180
38.0%

photo
URL

UNIQUE 

Distinct87
Distinct (%)100.0%
Missing0
Missing (%)0.0%
Memory size11.1 KiB
https://static.wikia.nocookie.net/starwars/images/d/d9/Luke-rotjpromo.jpg
 
1
https://static.wikia.nocookie.net/starwars/images/9/96/Yarael_Poof.png
 
1
https://static.wikia.nocookie.net/starwars/images/a/a4/BarrissOffee-OP.png
 
1
https://static.wikia.nocookie.net/starwars/images/9/91/LuminaraUnduli-Encyclopedia.png
 
1
https://static.wikia.nocookie.net/starwars/images/9/93/Poggle_the_lesser_-_sw_card_trader.png
 
1
Other values (82)
82 
ValueCountFrequency (%)
https://static.wikia.nocookie.net/starwars/images/d/d9/Luke-rotjpromo.jpg 1
 
1.1%
https://static.wikia.nocookie.net/starwars/images/9/96/Yarael_Poof.png 1
 
1.1%
https://static.wikia.nocookie.net/starwars/images/a/a4/BarrissOffee-OP.png 1
 
1.1%
https://static.wikia.nocookie.net/starwars/images/9/91/LuminaraUnduli-Encyclopedia.png 1
 
1.1%
https://static.wikia.nocookie.net/starwars/images/9/93/Poggle_the_lesser_-_sw_card_trader.png 1
 
1.1%
https://static.wikia.nocookie.net/starwars/images/c/c1/ClieggLars-FF72.png 1
 
1.1%
https://static.wikia.nocookie.net/starwars/images/9/95/Corde-SWCTP.png 1
 
1.1%
https://static.wikia.nocookie.net/starwars/images/6/6f/GregarTypho-FF103.png 1
 
1.1%
https://static.wikia.nocookie.net/starwars/images/1/14/MasAmedda-BTAHE3.png 1
 
1.1%
https://static.wikia.nocookie.net/starwars/images/c/c4/Plo_Koon_TPM.png 1
 
1.1%
Other values (77) 77
88.5%
ValueCountFrequency (%)
https 87
100.0%
ValueCountFrequency (%)
static.wikia.nocookie.net 87
100.0%
ValueCountFrequency (%)
/starwars/images/d/d9/Luke-rotjpromo.jpg 1
 
1.1%
/starwars/images/9/96/Yarael_Poof.png 1
 
1.1%
/starwars/images/a/a4/BarrissOffee-OP.png 1
 
1.1%
/starwars/images/9/91/LuminaraUnduli-Encyclopedia.png 1
 
1.1%
/starwars/images/9/93/Poggle_the_lesser_-_sw_card_trader.png 1
 
1.1%
/starwars/images/c/c1/ClieggLars-FF72.png 1
 
1.1%
/starwars/images/9/95/Corde-SWCTP.png 1
 
1.1%
/starwars/images/6/6f/GregarTypho-FF103.png 1
 
1.1%
/starwars/images/1/14/MasAmedda-BTAHE3.png 1
 
1.1%
/starwars/images/c/c4/Plo_Koon_TPM.png 1
 
1.1%
Other values (77) 77
88.5%
ValueCountFrequency (%)
87
100.0%
ValueCountFrequency (%)
87
100.0%
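
The tables above break each photo URL into its components. The last two tables, whose value is empty for all 87 rows, are presumably the query and fragment parts. The sketch below shows the same decomposition for the first URL using the standard library's `urllib.parse`, which may differ from the parser the report generator uses.

```python
from urllib.parse import urlsplit

url = "https://static.wikia.nocookie.net/starwars/images/d/d9/Luke-rotjpromo.jpg"
parts = urlsplit(url)

print(parts.scheme)    # https
print(parts.netloc)    # static.wikia.nocookie.net
print(parts.path)      # /starwars/images/d/d9/Luke-rotjpromo.jpg
print(repr(parts.query), repr(parts.fragment))  # both are empty strings
```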

Interactions

[16 interaction plot panels; images not available in this text export.]

Correlations

             abilities   birth_era  birth_year   death_era  death_year   eye_color       films      gender  hair_color      height        mass     pronoun         sex  skin_color     species
abilities        1.000       0.000      -0.024       0.435      -0.017       0.345       0.446       0.119       0.320      -0.157      -0.111       0.119       0.542       0.159       0.000
birth_era        0.000       1.000       0.469       0.000      -0.343       0.000       0.764       0.000       0.163       0.088       0.252       0.000       0.000       0.000       0.000
birth_year      -0.024       0.469       1.000       0.170      -0.258       0.441       0.287       0.000       0.287       0.099       0.174       0.000       0.531       0.380       0.738
death_era        0.435       0.000       0.170       1.000      -0.196       0.000       0.709       0.000       0.291       0.084      -0.098       0.000       0.060       0.000       0.000
death_year      -0.017      -0.343      -0.258      -0.196       1.000       0.000       0.486       0.000       0.247      -0.072      -0.243       0.000       0.000       0.100       0.089
eye_color        0.345       0.000       0.441       0.000       0.000       1.000       0.000       0.221       0.284      -0.066      -0.026       0.221       0.309       0.360       0.437
films            0.446       0.764       0.287       0.709       0.486       0.000       1.000       0.000       0.314       0.111      -0.095       0.000       0.519       0.054       0.000
gender           0.119       0.000       0.000       0.000       0.000       0.221       0.000       1.000       0.092       0.307       0.421       0.966       0.959       0.000       0.000
hair_color       0.320       0.163       0.287       0.291       0.247       0.284       0.314       0.092       1.000       0.189       0.004       0.092       0.000       0.000       0.000
height          -0.157       0.088       0.099       0.084      -0.072      -0.066       0.111       0.307       0.189       1.000       0.757       0.418       0.359       0.442       0.576
mass            -0.111       0.252       0.174      -0.098      -0.243      -0.026      -0.095       0.421       0.004       0.757       1.000       0.000       0.686       0.768       0.718
pronoun          0.119       0.000       0.000       0.000       0.000       0.221       0.000       0.966       0.092       0.418       0.000       1.000       0.959       0.000       0.000
sex              0.542       0.000       0.531       0.060       0.000       0.309       0.519       0.959       0.000       0.359       0.686       0.959       1.000       0.622       0.609
skin_color       0.159       0.000       0.380       0.000       0.100       0.360       0.054       0.000       0.000       0.442       0.768       0.000       0.622       1.000       0.576
species          0.000       0.000       0.738       0.000       0.089       0.437       0.000       0.000       0.000       0.576       0.718       0.000       0.609       0.576       1.000
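
The matrix mixes numeric and categorical variables, and the report does not state which coefficient was used for each pair. One commonly used association measure for two categorical columns is Cramér's V. The sketch below implements the uncorrected form under that assumption and makes no attempt to reproduce the exact values above (for example the 0.966 for gender and pronoun). The toy columns and the function name `cramers_v` are hypothetical.

```python
import math
from collections import Counter

def cramers_v(xs, ys):
    """Uncorrected Cramér's V for two equal-length sequences of labels."""
    n = len(xs)
    pair_counts = Counter(zip(xs, ys))
    x_counts = Counter(xs)
    y_counts = Counter(ys)

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for x, nx in x_counts.items():
        for y, ny in y_counts.items():
            expected = nx * ny / n
            observed = pair_counts.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected

    # Normalise by the smaller table dimension minus one.
    k = min(len(x_counts), len(y_counts)) - 1
    return math.sqrt(chi2 / (n * k)) if k > 0 else 0.0

# Hypothetical toy columns: pronoun is fully determined by gender.
gender = ["masculine", "masculine", "feminine", "masculine", "feminine", "masculine"]
pronoun = ["he/him", "he/him", "she/her", "he/him", "she/her", "he/him"]
print(round(cramers_v(gender, pronoun), 3))
```

A value near 1 means one column almost determines the other, and a value near 0 means the two columns are close to independent.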

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
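
To make the nullity measures concrete, the sketch below computes the missing fraction per column and the nullity correlation for three sparse columns. The flags are taken from the ten sample rows of this report, so the numbers differ from those for the full 87 rows. Nullity correlation is taken here to be the Pearson correlation between missing-value indicators, which is how such heatmaps are commonly computed; the function names are illustrative.

```python
import math

# Missing (True) / present (False) flags for the ten sample rows of this report.
is_missing = {
    "cybernetics": [False, True, True, False, True, True, True, True, True, True],
    "vehicles":    [False, True, True, True, False, False, True, True, True, False],
    "starships":   [False, True, True, False, True, True, True, True, False, False],
}

def missing_fraction(flags):
    # Booleans sum as 0/1, so this is the share of missing cells.
    return sum(flags) / len(flags)

def nullity_correlation(a, b):
    """Pearson correlation between two missing-value indicator columns.

    Undefined (division by zero) if either column is entirely missing
    or entirely present.
    """
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

for name, flags in is_missing.items():
    print(name, missing_fraction(flags))

r = nullity_correlation(is_missing["cybernetics"], is_missing["starships"])
print(round(r, 3))
```

A positive value means the two columns tend to be missing in the same rows, and a negative value means one tends to be present when the other is missing.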

Sample

| | name | height | mass | hair_color | skin_color | eye_color | birth_year | birth_era | birth_place | death_year | death_era | death_place | sex | gender | pronoun | homeworld | species | occupation | cybernetics | abilities | equipment | films | vehicles | starships | photo |
| 0 | Luke Skywalker | 172.0 | 77.0 | blond | light | blue | 19.0 | BBY | Polis Massa | 34.0 | ABY | Ahch-To | male | masculine | he/him | Tatooine | Human | Jedi Master | Prosthetic right hand | Lightsaber abilities, Force powers, Other abilities | Lightsabers, Blasters | A New Hope, The Empire Strikes Back, Return of the Jedi, Revenge of the Sith, The Force Awakens | Snowspeeder, Imperial Speeder Bike | X-wing, Imperial shuttle | https://static.wikia.nocookie.net/starwars/images/d/d9/Luke-rotjpromo.jpg |
| 1 | C-3PO | 167.0 | 75.0 | NaN | gold | yellow | 112.0 | BBY | Affa | 3.0 | ABY | Bespin | none | masculine | he/him | Tatooine | Droid | Protocol droid | NaN | Language known, Other skills | NaN | A New Hope, The Empire Strikes Back, Return of the Jedi, The Phantom Menace, Attack of the Clones, Revenge of the Sith | NaN | NaN | https://static.wikia.nocookie.net/starwars/images/5/51/C-3PO_EP3.png |
| 2 | R2-D2 | 96.0 | 32.0 | NaN | blue, silver, white | red | 33.0 | BBY | NaN | 20.0 | BBY | Carida | none | masculine | he/him | Naboo | Droid | Astromech droid | NaN | NaN | Buzz saw, Electric pike, Drinks tray, Fusion welder, Scomp link, Power recharge coupler, Rocket boosters, Holographic projector, Motorized all-terrain treads, Retractable third leg | A New Hope, The Empire Strikes Back, Return of the Jedi, The Phantom Menace, Attack of the Clones, Revenge of the Sith, The Force Awakens | NaN | NaN | https://static.wikia.nocookie.net/starwars/images/6/6d/R2D2-Chronicles.png |
| 3 | Darth Vader | 202.0 | 136.0 | sandy-blond | pale | yellow | 41.0 | BBY | Tatooine | 4.0 | ABY | Death Star II | male | masculine | he/him | Tatooine | Human | Dark Lord of the Sith | Prosthetic arms and legs, life-support system | Lightsaber abilities, Force powers, Language known | Clothing, Lightsabers | A New Hope, The Empire Strikes Back, Return of the Jedi, Revenge of the Sith | NaN | TIE Advanced x1 | https://static.wikia.nocookie.net/starwars/images/a/a3/ANOVOS_Darth_Vader_1.png |
| 4 | Leia Organa Solo | 150.0 | 49.0 | dark brown | light | brown | 19.0 | BBY | Polis Massa | 35.0 | ABY | Ajan Kloss | female | feminine | she/her | Alderaan | Human | Princess of Alderaan | NaN | Jedi training, Force powers, Other abilities | Defender sporting blaster pistol, X-30 Lancer target blast pistol, Lightsaber | A New Hope, The Empire Strikes Back, Return of the Jedi, Revenge of the Sith, The Force Awakens | Imperial Speeder Bike | NaN | https://static.wikia.nocookie.net/starwars/images/8/89/Leia_endorpromo02.jpg |
| 5 | Owen Lars | 178.0 | 120.0 | brown | light | blue | 52.0 | BBY | Ator | 0.0 | BBY | Tatooine | male | masculine | he/him | Tatooine | Human | Moisture farmer | NaN | Blaster abilities | Blasters, SX-14 Field Hover-Ute, GX-8 Moisture Vaporators, Droids | A New Hope, Attack of the Clones, Revenge of the Sith | Zephyr-G swoop bike, V-35 Courier, T-16 Skyhopper | NaN | https://static.wikia.nocookie.net/starwars/images/9/91/OwenLarsHS-SWE.jpg |
| 6 | Beru Whitesun Lars | 165.0 | 75.0 | brown | light | blue | 47.0 | BBY | NaN | 0.0 | BBY | Tatooine | female | feminine | she/her | Tatooine | Human | Moisture farmer | NaN | NaN | Light robes | A New Hope, Attack of the Clones, Revenge of the Sith | NaN | NaN | https://static.wikia.nocookie.net/starwars/images/7/76/Beru_headshot2.jpg |
| 7 | R5-D4 | 97.0 | 32.0 | NaN | white, red, blue | red | NaN | NaN | NaN | NaN | NaN | NaN | none | masculine | he/him | Tatooine | Droid | Astromech droid | NaN | NaN | Holoprojector, Rocket booster, Scomp link | A New Hope | NaN | NaN | https://static.wikia.nocookie.net/starwars/images/2/2c/R5d4.jpg |
| 8 | Biggs Darklighter | 183.0 | 84.0 | black | light | brown | 24.0 | BBY | Tatooine | 0.0 | BBY | Yavin system | male | masculine | he/him | Tatooine | Human | Pilot | NaN | Piloting | NaN | A New Hope | NaN | X-wing | https://static.wikia.nocookie.net/starwars/images/0/00/BiggsHS-ANH.png |
| 9 | Obi-Wan Kenobi | 182.0 | 77.0 | auburn | fair | blue-gray | 57.0 | BBY | Stewjon | 0.0 | BBY | DS-1 Orbital Battle Station | male | masculine | he/him | Stewjon | Human | Jedi General | NaN | Lightsaber training, Force powers, Other abilities | Lightsabers | A New Hope, The Empire Strikes Back, Return of the Jedi, The Phantom Menace, Attack of the Clones, Revenge of the Sith | Tribubble bongo | Jedi starfighter, Trade Federation cruiser, Naboo star skiff, Jedi Interceptor, Belbullab-22 starfighter | https://static.wikia.nocookie.net/starwars/images/7/74/OWK-SWFB.png |

| | name | height | mass | hair_color | skin_color | eye_color | birth_year | birth_era | birth_place | death_year | death_era | death_place | sex | gender | pronoun | homeworld | species | occupation | cybernetics | abilities | equipment | films | vehicles | starships | photo |
| 77 | Grievous | 216.0 | 159.0 | none | brown, white | gold | NaN | NaN | NaN | 19.0 | BBY | Utapau | male | masculine | he/him | Kalee | Kaleesh | Jedi Hunter | Completely cebernetic except for brain | Lightsaber abilities | Lightsabers | Revenge of the Sith | Tsmeu-6 personal wheel bike | Belbullab-22 starfighter | https://static.wikia.nocookie.net/starwars/images/c/ca/Grievoushead-OP.png |
| 78 | Tarfful | 234.0 | 136.0 | brown | brown | blue | NaN | NaN | NaN | NaN | NaN | NaN | male | masculine | he/him | Kashyyyk | Wookiee | Wookiee chieftain | NaN | NaN | Weapons | Revenge of the Sith | NaN | NaN | https://static.wikia.nocookie.net/starwars/images/3/37/Tarfful_RotS.png |
| 79 | Raymus Antilles | 188.0 | 79.0 | brown | light | brown | NaN | NaN | NaN | 0.0 | BBY | Tantive IV | male | masculine | he/him | Alderaan | Human | Captain of the CR90 corvette | NaN | Piloting | NaN | A New Hope, Revenge of the Sith | NaN | NaN | https://static.wikia.nocookie.net/starwars/images/8/82/RaymusAntilles-FFp46.png |
| 80 | Sly Moore | 178.0 | 48.0 | none | pale | white | NaN | NaN | NaN | 18.0 | BBY | NaN | female | feminine | she/her | Umbara | Umbaran | Personal aide | NaN | Politics, Force powers | NaN | Attack of the Clones, Revenge of the Sith | NaN | NaN | https://static.wikia.nocookie.net/starwars/images/b/b7/SlyMooreStare-OP.png |
| 81 | Tion Medon | 206.0 | 80.0 | none | gray, red | black | NaN | NaN | NaN | NaN | NaN | NaN | male | masculine | he/him | Utapau | Pau'an | Port Administrator | NaN | Politics | Clothing | Revenge of the Sith | NaN | NaN | https://static.wikia.nocookie.net/starwars/images/c/c0/TionMedon-SS.png |
| 82 | Finn | 178.0 | 73.0 | black | dark | brown | 11.0 | ABY | NaN | NaN | NaN | NaN | male | masculine | he/him | NaN | Human | General in the Rebel Alliance | NaN | NaN | NaN | The Force Awakens | NaN | NaN | https://static.wikia.nocookie.net/starwars/images/1/1a/Finn-TSWB.png |
| 83 | Rey Skywalker | 170.0 | 54.0 | brown | light | hazel | 15.0 | ABY | Hyperkarn | 35.0 | ABY | Exegol | female | feminine | she/her | Jakku | Human | Jedi | NaN | Lightsaber abilities, Force powers, Force lightning | Hellhound two, Vehicles, Tools and wapons, Lightsabers | The Force Awakens | NaN | NaN | https://static.wikia.nocookie.net/starwars/images/2/2b/Rey_TROS_Fathead.png |
| 84 | Poe Dameron | 172.0 | 80.0 | dark brown | tan | brown | 2.0 | ABY | Yavin 4 | NaN | NaN | NaN | male | masculine | he/him | Yavin 4 | Human | General in the Rebel Alliance | NaN | Piloting | Clothing | The Force Awakens | NaN | X-wing | https://static.wikia.nocookie.net/starwars/images/6/6b/PoeDameron-Heroes2023.png |
| 85 | BB-8 | 67.0 | 18.0 | none | white, orange | red | 29.0 | ABY | NaN | NaN | NaN | NaN | none | masculine | he/him | Hosnian Prime | Droid | Astromech droid | NaN | NaN | Grappling spike launcher, Welding torch, Holoprojector, Arc welder | The Force Awakens | NaN | NaN | https://static.wikia.nocookie.net/starwars/images/6/68/BB8-Fathead.png |
| 86 | Captain Phasma | 200.0 | 76.0 | gold | light | blue | 6.0 | ABY | Parnassos | 34.0 | ABY | Crait system | female | feminine | she/her | Parnassos | Human | Stormtrooper commander | NaN | Shooting | Rust-read war mask, Armor coated, Weapons, Blasters | The Force Awakens | NaN | NaN | https://static.wikia.nocookie.net/starwars/images/0/02/Phasma.png |