Who's comfortable with Wikidata as a concept?
Who's comfortable with Wikidata's underlying structure?
Who's comfortable looking at code examples?
Who's ready to start hacking in this session?
Answer BIG questions‽
The word for every language in every language?
Reminder: it's not just the Wikipedias, but also Commons, Wikisource, and Wikivoyage (with more to come).
What does the world look like according to data? Attribution: Denny Vrandečić
"Wikidata data is complex." ~ Markus Krötzsch
Each item includes:
These are the less semantic properties.
Properties. Prepare for the semantics.
So the triple reads:
[This item / page] [property] [value]
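To make the triple shape concrete, here is a minimal Python sketch. The `Triple` type and the `read_triple` helper are made up for illustration and are not part of any Wikidata library; Q42 (Douglas Adams), P21 (sex or gender) and Q6581097 (male) are real identifiers.

```python
from collections import namedtuple

# A statement reduced to its bare triple: item, property, value.
# (Hypothetical type, only for illustration.)
Triple = namedtuple("Triple", ["item", "prop", "value"])

# Q42 is Douglas Adams, P21 is "sex or gender", Q6581097 is "male".
example = Triple(item="Q42", prop="P21", value="Q6581097")


def read_triple(triple):
    """Render a triple in the '[item] [property] [value]' reading order."""
    return "[{0}] [{1}] [{2}]".format(triple.item, triple.prop, triple.value)


print(read_triple(example))
```

Real statements carry more than this (qualifiers, references, ranks), so treat the triple as the simplest possible reading.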
We can use the JavaScript API; in fact, that's what the interface uses.
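For a single item, a request against the web API could look roughly like the sketch below. It only builds the URL and does not send anything. `wbgetentities` and its `ids`, `props`, `languages` and `format` parameters are part of the Wikibase API; the `entity_url` helper is an invented name, and the code is Python 3 (unlike the notebook later on).

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://www.wikidata.org/w/api.php"


def entity_url(qid, languages=("en",)):
    """Build a wbgetentities request URL for one item (hypothetical helper)."""
    params = [
        ("action", "wbgetentities"),
        ("ids", qid),
        ("props", "labels|claims|sitelinks"),
        ("languages", "|".join(languages)),
        ("format", "json"),
    ]
    # urlencode percent-encodes the "|" separators for us.
    return API_ENDPOINT + "?" + urlencode(params)


print(entity_url("Q42"))
```

This is fine for a handful of items; it is not a sensible way to walk the whole dataset.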
But what about if you want lots of data?
Then you just have undocumented XML dumps.
There are no real 'standard queries'.
So you could also imagine queries like, "Which sources do I have to believe in order to believe this statement?"
Don't stifle creativity.
So, in order to support "tree-shaped regular conjunctive path queries" and "star-shaped" queries, you get the freedom of a programming language rather than SQL-like chains.
Go over all pages. If a page has this property, bin it in two dimensions. By the way, I failed Java classes twice at university, each time ending my formal education in programming.
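As a rough idea of what a "star-shaped" query looks like when written as ordinary code: every constraint hangs off the same central item. The item layout below is a simplified toy shape, not the real dump format, and `has_value` and `star_query` are hypothetical helpers. P31 (instance of), Q5 (human), P21 (sex or gender) and Q6581072 (female) are real identifiers.

```python
HUMAN = "Q5"          # a value for P31 ("instance of")
FEMALE = "Q6581072"   # a value for P21 ("sex or gender")


def has_value(item, prop, value):
    """True if the toy item has a statement 'prop = value'."""
    return value in item["claims"].get(prop, [])


def star_query(items, constraints):
    """Yield ids of items that satisfy every (property, value) constraint.

    Each constraint is one arm of the star; the item is its centre.
    """
    for item in items:
        if all(has_value(item, p, v) for p, v in constraints):
            yield item["id"]


# Toy data in a simplified, made-up shape.
toy_items = [
    {"id": "toy-1", "claims": {"P31": ["Q5"], "P21": ["Q6581072"]}},
    {"id": "toy-2", "claims": {"P31": ["Q5"], "P21": ["Q6581097"]}},
    {"id": "toy-3", "claims": {"P21": ["Q6581072"]}},
]

matches = list(star_query(toy_items, [("P31", HUMAN), ("P21", FEMALE)]))
print(matches)
```

Because the constraint is just a Python predicate, swapping in something a query language would struggle with is a one-line change.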
I understand this is the worst way to do it.
At a minimum, you only need to edit one class: the ItemStatisticsProcessor in DumpProcessingExample.
// Needs Snak, ValueSnak, Value and ItemIdValue from org.wikidata.wdtk.datamodel.interfaces.
static class ItemStatisticsProcessor implements EntityDocumentProcessor {

    long countItems = 0;
    // Counts keyed by "<site key>--<QID of the P21 value>".
    HashMap<String, Integer> lang_sexes = new HashMap<String, Integer>();

    @Override
    public void processItemDocument(ItemDocument itemDocument) {
        this.countItems++;
        for (StatementGroup sg : itemDocument.getStatementGroups()) {
            for (Statement si : sg.getStatements()) {
                Snak mainSnak = si.getClaim().getMainSnak();
                // Only P21 ("sex or gender") statements are of interest.
                if (!mainSnak.getPropertyId().getId().equals("P21")) {
                    continue;
                }
                // Read the value through the data model instead of splitting
                // the snak's toString() output; skip "no value" snaks.
                if (!(mainSnak instanceof ValueSnak)) {
                    continue;
                }
                Value value = ((ValueSnak) mainSnak).getValue();
                if (!(value instanceof ItemIdValue)) {
                    continue;
                }
                String VID = ((ItemIdValue) value).getId();
                // Count the value once for every site the item is linked from.
                for (String lang_string : itemDocument.getSiteLinks().keySet()) {
                    String lang_sex_key = lang_string + "--" + VID;
                    Integer seen = this.lang_sexes.get(lang_sex_key);
                    this.lang_sexes.put(lang_sex_key, seen == null ? 1 : seen + 1);
                }
            }
        }
    }
    // ... the other EntityDocumentProcessor methods are not shown here.
}
There's actually some more you need to edit to get the JSON out, but I'll let you see my document at this GitHub link.
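If Java is not your thing, the same two-dimensional binning can be sketched in a few lines of Python. This is not the Wikidata Toolkit code: the item layout is a simplified toy shape rather than the real dump format, and `bin_by_site_and_value` is a made-up name. The only thing it shares with the real pipeline is the "site--QID" key convention that the notebook reads back.

```python
import json
from collections import Counter


def bin_by_site_and_value(items, prop="P21"):
    """Count (site, value) pairs, keyed as '<site>--<QID>'."""
    counts = Counter()
    for item in items:
        for value in item["claims"].get(prop, []):
            # One count per site the item is linked from.
            for site in item["sitelinks"]:
                counts[site + "--" + value] += 1
    return counts


# Toy items in a simplified, made-up shape.
toy_items = [
    {"sitelinks": ["enwiki", "dewiki"], "claims": {"P21": ["Q6581072"]}},
    {"sitelinks": ["enwiki"], "claims": {"P21": ["Q6581097"]}},
    {"sitelinks": ["enwiki"], "claims": {}},  # no P21, so never counted
]

counts = bin_by_site_and_value(toy_items)
serialized = json.dumps(counts, sort_keys=True)
print(serialized)
```

Writing `serialized` to a file would give something in the same flat shape as the `lang_sex.json` used in the demo.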
Here's one I made earlier...
Start the live high-wire demo...
import json
from collections import defaultdict
import pandas as pd
import pywikibot
import decimal
NOPLACES = decimal.Decimal(10) ** 0
TWOPLACES = decimal.Decimal(10) ** -2
%pylab inline
Populating the interactive namespace from numpy and matplotlib
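# Load the "<site>--<QID>" counts produced by the dump processor.
with open('lang_sex.json', 'r') as jsonfile:
    bigdict = json.load(jsonfile)
# Reshape the flat keys into {site: {QID: count}}.
lang_sex = defaultdict(dict)
for keystring, count in bigdict.items():
    lang, sex = keystring.split('--')
    lang_sex[lang][sex] = count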
sex_df = pd.DataFrame.from_dict(lang_sex, orient='index')
sex_df = sex_df.fillna(value=0.0)
sex_df
site | Q43445 | Q1097630 | Q746411 | Q639354 | Q1052281 | Q44148 | Q6581097 | Q6581072 | Q2449503 | Q48270 | Q8441 | Q658 |
---|---|---|---|---|---|---|---|---|---|---|---|---|
abwiki | 0 | 0 | 0 | 0 | 0 | 0 | 58 | 7 | 0 | 0 | 0 | 0 |
acewiki | 0 | 0 | 0 | 0 | 0 | 0 | 179 | 34 | 0 | 0 | 0 | 0 |
afwiki | 0 | 0 | 0 | 1 | 0 | 1 | 3066 | 402 | 0 | 0 | 0 | 0 |
afwikiquote | 0 | 0 | 0 | 0 | 0 | 0 | 88 | 5 | 0 | 0 | 0 | 0 |
akwiki | 0 | 0 | 0 | 0 | 0 | 0 | 13 | 2 | 0 | 0 | 0 | 0 |
alswiki | 0 | 0 | 0 | 1 | 0 | 0 | 1567 | 196 | 0 | 0 | 0 | 0 |
amwiki | 0 | 0 | 0 | 0 | 0 | 0 | 483 | 56 | 0 | 0 | 0 | 0 |
angwiki | 0 | 0 | 0 | 0 | 0 | 0 | 205 | 52 | 0 | 0 | 0 | 0 |
anwiki | 0 | 0 | 0 | 0 | 0 | 0 | 2881 | 575 | 0 | 0 | 0 | 0 |
arcwiki | 0 | 0 | 0 | 0 | 0 | 0 | 90 | 12 | 0 | 0 | 0 | 0 |
arwiki | 1 | 1 | 0 | 1 | 2 | 2 | 21320 | 3868 | 0 | 1 | 0 | 0 |
arwikiquote | 0 | 0 | 0 | 0 | 0 | 0 | 154 | 11 | 0 | 0 | 0 | 0 |
arwikisource | 0 | 0 | 0 | 0 | 0 | 0 | 15 | 2 | 0 | 0 | 0 | 0 |
arzwiki | 0 | 0 | 0 | 0 | 0 | 0 | 1692 | 682 | 1 | 0 | 0 | 0 |
astwiki | 0 | 0 | 0 | 0 | 0 | 0 | 1224 | 228 | 0 | 0 | 0 | 0 |
aswiki | 0 | 0 | 0 | 0 | 0 | 0 | 175 | 51 | 0 | 0 | 0 | 0 |
avwiki | 0 | 0 | 0 | 0 | 0 | 0 | 29 | 0 | 0 | 0 | 0 | 0 |
aywiki | 0 | 0 | 0 | 0 | 0 | 0 | 187 | 17 | 0 | 0 | 0 | 0 |
azwiki | 1 | 1 | 0 | 0 | 1 | 1 | 4747 | 1002 | 0 | 0 | 0 | 0 |
azwikiquote | 0 | 0 | 0 | 0 | 0 | 0 | 454 | 30 | 0 | 0 | 0 | 0 |
azwikisource | 0 | 0 | 0 | 0 | 0 | 0 | 6 | 0 | 0 | 0 | 0 | 0 |
barwiki | 0 | 0 | 0 | 0 | 0 | 0 | 836 | 134 | 0 | 0 | 0 | 0 |
bat_smgwiki | 0 | 1 | 0 | 0 | 0 | 0 | 638 | 123 | 0 | 0 | 0 | 0 |
bawiki | 0 | 0 | 0 | 0 | 0 | 0 | 337 | 31 | 0 | 0 | 0 | 0 |
bclwiki | 0 | 1 | 0 | 0 | 0 | 0 | 414 | 74 | 0 | 0 | 0 | 0 |
be_x_oldwiki | 0 | 0 | 0 | 1 | 0 | 1 | 5179 | 724 | 0 | 0 | 0 | 0 |
bewiki | 0 | 0 | 0 | 1 | 0 | 1 | 8817 | 1240 | 0 | 0 | 0 | 0 |
bewikiquote | 0 | 0 | 0 | 0 | 0 | 0 | 31 | 0 | 0 | 0 | 0 | 0 |
bewikisource | 0 | 0 | 0 | 0 | 0 | 0 | 20 | 1 | 0 | 0 | 0 | 0 |
bgwiki | 2 | 1 | 0 | 1 | 2 | 4 | 18512 | 3536 | 0 | 0 | 0 | 0 |
bgwikiquote | 0 | 0 | 0 | 1 | 0 | 0 | 1592 | 334 | 0 | 0 | 0 | 0 |
bgwikisource | 0 | 0 | 0 | 0 | 0 | 0 | 22 | 2 | 0 | 0 | 0 | 0 |
bhwiki | 0 | 0 | 0 | 0 | 0 | 0 | 21 | 3 | 0 | 0 | 0 | 0 |
biwiki | 0 | 0 | 0 | 0 | 0 | 0 | 37 | 10 | 0 | 0 | 0 | 0 |
bjnwiki | 0 | 0 | 0 | 0 | 0 | 0 | 26 | 14 | 0 | 0 | 0 | 0 |
bmwiki | 0 | 0 | 0 | 0 | 0 | 0 | 21 | 3 | 0 | 0 | 0 | 0 |
bnwiki | 1 | 0 | 0 | 1 | 1 | 0 | 3515 | 676 | 0 | 0 | 0 | 0 |
bnwikisource | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0 | 0 | 0 | 0 | 0 |
bowiki | 0 | 0 | 0 | 0 | 0 | 0 | 316 | 44 | 0 | 0 | 0 | 0 |
bpywiki | 0 | 0 | 0 | 0 | 0 | 0 | 57 | 14 | 0 | 0 | 0 | 0 |
brwiki | 0 | 0 | 0 | 1 | 0 | 0 | 4342 | 1438 | 0 | 0 | 0 | 0 |
brwikiquote | 0 | 0 | 0 | 0 | 0 | 0 | 52 | 2 | 0 | 0 | 0 | 0 |
brwikisource | 0 | 0 | 0 | 0 | 0 | 0 | 19 | 2 | 0 | 0 | 0 | 0 |
bswiki | 1 | 0 | 0 | 0 | 0 | 2 | 3558 | 610 | 0 | 0 | 0 | 0 |
bswikiquote | 0 | 0 | 0 | 0 | 0 | 0 | 994 | 147 | 0 | 0 | 0 | 0 |
bswikisource | 0 | 0 | 0 | 0 | 0 | 0 | 6 | 0 | 0 | 0 | 0 | 0 |
bugwiki | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 3 | 0 | 0 | 0 | 0 |
bxrwiki | 0 | 0 | 0 | 0 | 0 | 0 | 109 | 9 | 0 | 0 | 0 | 0 |
cawiki | 1 | 0 | 0 | 1 | 4 | 5 | 33846 | 5005 | 1 | 0 | 0 | 0 |
cawikiquote | 0 | 0 | 0 | 0 | 0 | 0 | 534 | 59 | 0 | 0 | 0 | 0 |
cawikisource | 0 | 0 | 0 | 0 | 0 | 0 | 174 | 6 | 0 | 0 | 0 | 0 |
cbk_zamwiki | 0 | 0 | 0 | 0 | 0 | 0 | 119 | 47 | 0 | 0 | 0 | 0 |
cdowiki | 0 | 0 | 0 | 0 | 0 | 0 | 36 | 4 | 0 | 0 | 0 | 0 |
cebwiki | 0 | 0 | 0 | 0 | 0 | 0 | 404 | 72 | 0 | 0 | 0 | 0 |
cewiki | 0 | 0 | 0 | 0 | 0 | 0 | 249 | 16 | 0 | 0 | 0 | 0 |
chrwiki | 0 | 0 | 0 | 0 | 0 | 0 | 19 | 18 | 0 | 0 | 0 | 0 |
chwiki | 0 | 0 | 0 | 0 | 0 | 0 | 11 | 1 | 0 | 0 | 0 | 0 |
chywiki | 0 | 0 | 0 | 0 | 0 | 0 | 23 | 9 | 0 | 0 | 0 | 0 |
ckbwiki | 0 | 1 | 0 | 0 | 1 | 0 | 1520 | 180 | 0 | 0 | 0 | 0 |
commonswiki | 1 | 0 | 0 | 1 | 2 | 6 | 18146 | 4221 | 1 | 1 | 0 | 0 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
396 rows × 12 columns
# norm_sex is a joke on heteronormativity; each row is divided by its total to give shares.
norm_sex = sex_df.apply(lambda row: row / row.sum(), axis=1)
norm_sex
site | Q43445 | Q1097630 | Q746411 | Q639354 | Q1052281 | Q44148 | Q6581097 | Q6581072 | Q2449503 | Q48270 | Q8441 | Q658 |
---|---|---|---|---|---|---|---|---|---|---|---|---|
abwiki | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 0.892308 | 0.107692 | 0.000000 | 0.000000 | 0 | 0 |
acewiki | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 0.840376 | 0.159624 | 0.000000 | 0.000000 | 0 | 0 |
afwiki | 0.000000 | 0.000000 | 0 | 0.000288 | 0.000000 | 0.000288 | 0.883573 | 0.115850 | 0.000000 | 0.000000 | 0 | 0 |
afwikiquote | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 0.946237 | 0.053763 | 0.000000 | 0.000000 | 0 | 0 |
akwiki | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 0.866667 | 0.133333 | 0.000000 | 0.000000 | 0 | 0 |
alswiki | 0.000000 | 0.000000 | 0 | 0.000567 | 0.000000 | 0.000000 | 0.888322 | 0.111111 | 0.000000 | 0.000000 | 0 | 0 |
amwiki | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 0.896104 | 0.103896 | 0.000000 | 0.000000 | 0 | 0 |
angwiki | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 0.797665 | 0.202335 | 0.000000 | 0.000000 | 0 | 0 |
anwiki | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 0.833623 | 0.166377 | 0.000000 | 0.000000 | 0 | 0 |
arcwiki | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 0.882353 | 0.117647 | 0.000000 | 0.000000 | 0 | 0 |
arwiki | 0.000040 | 0.000040 | 0 | 0.000040 | 0.000079 | 0.000079 | 0.846166 | 0.153516 | 0.000000 | 0.000040 | 0 | 0 |
arwikiquote | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 0.933333 | 0.066667 | 0.000000 | 0.000000 | 0 | 0 |
arwikisource | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 0.882353 | 0.117647 | 0.000000 | 0.000000 | 0 | 0 |
arzwiki | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 0.712421 | 0.287158 | 0.000421 | 0.000000 | 0 | 0 |
astwiki | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 0.842975 | 0.157025 | 0.000000 | 0.000000 | 0 | 0 |
aswiki | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 0.774336 | 0.225664 | 0.000000 | 0.000000 | 0 | 0 |
avwiki | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 1.000000 | 0.000000 | 0.000000 | 0.000000 | 0 | 0 |
aywiki | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 0.916667 | 0.083333 | 0.000000 | 0.000000 | 0 | 0 |
azwiki | 0.000174 | 0.000174 | 0 | 0.000000 | 0.000174 | 0.000174 | 0.825135 | 0.174170 | 0.000000 | 0.000000 | 0 | 0 |
azwikiquote | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 0.938017 | 0.061983 | 0.000000 | 0.000000 | 0 | 0 |
azwikisource | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 1.000000 | 0.000000 | 0.000000 | 0.000000 | 0 | 0 |
barwiki | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 0.861856 | 0.138144 | 0.000000 | 0.000000 | 0 | 0 |
bat_smgwiki | 0.000000 | 0.001312 | 0 | 0.000000 | 0.000000 | 0.000000 | 0.837270 | 0.161417 | 0.000000 | 0.000000 | 0 | 0 |
bawiki | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 0.915761 | 0.084239 | 0.000000 | 0.000000 | 0 | 0 |
bclwiki | 0.000000 | 0.002045 | 0 | 0.000000 | 0.000000 | 0.000000 | 0.846626 | 0.151329 | 0.000000 | 0.000000 | 0 | 0 |
be_x_oldwiki | 0.000000 | 0.000000 | 0 | 0.000169 | 0.000000 | 0.000169 | 0.877053 | 0.122608 | 0.000000 | 0.000000 | 0 | 0 |
bewiki | 0.000000 | 0.000000 | 0 | 0.000099 | 0.000000 | 0.000099 | 0.876528 | 0.123273 | 0.000000 | 0.000000 | 0 | 0 |
bewikiquote | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 1.000000 | 0.000000 | 0.000000 | 0.000000 | 0 | 0 |
bewikisource | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 0.952381 | 0.047619 | 0.000000 | 0.000000 | 0 | 0 |
bgwiki | 0.000091 | 0.000045 | 0 | 0.000045 | 0.000091 | 0.000181 | 0.839242 | 0.160305 | 0.000000 | 0.000000 | 0 | 0 |
bgwikiquote | 0.000000 | 0.000000 | 0 | 0.000519 | 0.000000 | 0.000000 | 0.826155 | 0.173326 | 0.000000 | 0.000000 | 0 | 0 |
bgwikisource | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 0.916667 | 0.083333 | 0.000000 | 0.000000 | 0 | 0 |
bhwiki | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 0.875000 | 0.125000 | 0.000000 | 0.000000 | 0 | 0 |
biwiki | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 0.787234 | 0.212766 | 0.000000 | 0.000000 | 0 | 0 |
bjnwiki | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 0.650000 | 0.350000 | 0.000000 | 0.000000 | 0 | 0 |
bmwiki | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 0.875000 | 0.125000 | 0.000000 | 0.000000 | 0 | 0 |
bnwiki | 0.000238 | 0.000000 | 0 | 0.000238 | 0.000238 | 0.000000 | 0.838102 | 0.161183 | 0.000000 | 0.000000 | 0 | 0 |
bnwikisource | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 1.000000 | 0.000000 | 0.000000 | 0.000000 | 0 | 0 |
bowiki | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 0.877778 | 0.122222 | 0.000000 | 0.000000 | 0 | 0 |
bpywiki | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 0.802817 | 0.197183 | 0.000000 | 0.000000 | 0 | 0 |
brwiki | 0.000000 | 0.000000 | 0 | 0.000173 | 0.000000 | 0.000000 | 0.751081 | 0.248746 | 0.000000 | 0.000000 | 0 | 0 |
brwikiquote | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 0.962963 | 0.037037 | 0.000000 | 0.000000 | 0 | 0 |
brwikisource | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 0.904762 | 0.095238 | 0.000000 | 0.000000 | 0 | 0 |
bswiki | 0.000240 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000480 | 0.853033 | 0.146248 | 0.000000 | 0.000000 | 0 | 0 |
bswikiquote | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 0.871166 | 0.128834 | 0.000000 | 0.000000 | 0 | 0 |
bswikisource | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 1.000000 | 0.000000 | 0.000000 | 0.000000 | 0 | 0 |
bugwiki | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 0.250000 | 0.750000 | 0.000000 | 0.000000 | 0 | 0 |
bxrwiki | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 0.923729 | 0.076271 | 0.000000 | 0.000000 | 0 | 0 |
cawiki | 0.000026 | 0.000000 | 0 | 0.000026 | 0.000103 | 0.000129 | 0.870905 | 0.128786 | 0.000026 | 0.000000 | 0 | 0 |
cawikiquote | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 0.900506 | 0.099494 | 0.000000 | 0.000000 | 0 | 0 |
cawikisource | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 0.966667 | 0.033333 | 0.000000 | 0.000000 | 0 | 0 |
cbk_zamwiki | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 0.716867 | 0.283133 | 0.000000 | 0.000000 | 0 | 0 |
cdowiki | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 0.900000 | 0.100000 | 0.000000 | 0.000000 | 0 | 0 |
cebwiki | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 0.848739 | 0.151261 | 0.000000 | 0.000000 | 0 | 0 |
cewiki | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 0.939623 | 0.060377 | 0.000000 | 0.000000 | 0 | 0 |
chrwiki | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 0.513514 | 0.486486 | 0.000000 | 0.000000 | 0 | 0 |
chwiki | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 0.916667 | 0.083333 | 0.000000 | 0.000000 | 0 | 0 |
chywiki | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 0.718750 | 0.281250 | 0.000000 | 0.000000 | 0 | 0 |
ckbwiki | 0.000000 | 0.000588 | 0 | 0.000000 | 0.000588 | 0.000000 | 0.893067 | 0.105758 | 0.000000 | 0.000000 | 0 | 0 |
commonswiki | 0.000045 | 0.000000 | 0 | 0.000045 | 0.000089 | 0.000268 | 0.810849 | 0.188614 | 0.000045 | 0.000045 | 0 | 0 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
396 rows × 12 columns
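As a sanity check on the normalization, the abwiki row can be redone by hand without pandas. The counts 58 and 7 come from the first table; `normalize_row` is a hypothetical helper that assumes the row total is non-zero.

```python
def normalize_row(counts):
    """Divide every count in a row by the row total (assumed non-zero)."""
    total = float(sum(counts.values()))
    return {key: value / total for key, value in counts.items()}


# Non-zero counts for abwiki, taken from the table of raw counts.
abwiki = {"Q6581097": 58, "Q6581072": 7}
shares = normalize_row(abwiki)

for qid in sorted(shares):
    print(qid, round(shares[qid], 6))
```

The rounded shares match the abwiki row in the normalized table: 0.892308 and 0.107692.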
# Transforming QIDs into English labels.
enwp = pywikibot.Site('en','wikipedia')
wikidata = enwp.data_repository()
def english_label(qid):
    page = pywikibot.ItemPage(wikidata, qid)
    data = page.get()
    return data['labels']['en']
sex_qs = [str(q) for q in norm_sex.columns]
sex_labels = [english_label(sex_q) for sex_q in sex_qs]
norm_sex.columns = sex_labels
norm_sex
VERBOSE:pywiki:Found 1 wikidata:wikidata processes running, including this one.
site | female animal | intersex | kathoey | Female | transgender female | male animal | male | female | transgender male | genderqueer | man | sodium |
---|---|---|---|---|---|---|---|---|---|---|---|---|
abwiki | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 0.892308 | 0.107692 | 0.000000 | 0.000000 | 0 | 0 |
acewiki | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 0.840376 | 0.159624 | 0.000000 | 0.000000 | 0 | 0 |
afwiki | 0.000000 | 0.000000 | 0 | 0.000288 | 0.000000 | 0.000288 | 0.883573 | 0.115850 | 0.000000 | 0.000000 | 0 | 0 |
afwikiquote | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 0.946237 | 0.053763 | 0.000000 | 0.000000 | 0 | 0 |
akwiki | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 0.866667 | 0.133333 | 0.000000 | 0.000000 | 0 | 0 |
alswiki | 0.000000 | 0.000000 | 0 | 0.000567 | 0.000000 | 0.000000 | 0.888322 | 0.111111 | 0.000000 | 0.000000 | 0 | 0 |
amwiki | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 0.896104 | 0.103896 | 0.000000 | 0.000000 | 0 | 0 |
angwiki | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 0.797665 | 0.202335 | 0.000000 | 0.000000 | 0 | 0 |
anwiki | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 0.833623 | 0.166377 | 0.000000 | 0.000000 | 0 | 0 |
arcwiki | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 0.882353 | 0.117647 | 0.000000 | 0.000000 | 0 | 0 |
arwiki | 0.000040 | 0.000040 | 0 | 0.000040 | 0.000079 | 0.000079 | 0.846166 | 0.153516 | 0.000000 | 0.000040 | 0 | 0 |
arwikiquote | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 0.933333 | 0.066667 | 0.000000 | 0.000000 | 0 | 0 |
arwikisource | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 0.882353 | 0.117647 | 0.000000 | 0.000000 | 0 | 0 |
arzwiki | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 0.712421 | 0.287158 | 0.000421 | 0.000000 | 0 | 0 |
astwiki | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 0.842975 | 0.157025 | 0.000000 | 0.000000 | 0 | 0 |
aswiki | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 0.774336 | 0.225664 | 0.000000 | 0.000000 | 0 | 0 |
avwiki | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 1.000000 | 0.000000 | 0.000000 | 0.000000 | 0 | 0 |
aywiki | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 0.916667 | 0.083333 | 0.000000 | 0.000000 | 0 | 0 |
azwiki | 0.000174 | 0.000174 | 0 | 0.000000 | 0.000174 | 0.000174 | 0.825135 | 0.174170 | 0.000000 | 0.000000 | 0 | 0 |
azwikiquote | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 0.938017 | 0.061983 | 0.000000 | 0.000000 | 0 | 0 |
azwikisource | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 1.000000 | 0.000000 | 0.000000 | 0.000000 | 0 | 0 |
barwiki | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 0.861856 | 0.138144 | 0.000000 | 0.000000 | 0 | 0 |
bat_smgwiki | 0.000000 | 0.001312 | 0 | 0.000000 | 0.000000 | 0.000000 | 0.837270 | 0.161417 | 0.000000 | 0.000000 | 0 | 0 |
bawiki | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 0.915761 | 0.084239 | 0.000000 | 0.000000 | 0 | 0 |
bclwiki | 0.000000 | 0.002045 | 0 | 0.000000 | 0.000000 | 0.000000 | 0.846626 | 0.151329 | 0.000000 | 0.000000 | 0 | 0 |
be_x_oldwiki | 0.000000 | 0.000000 | 0 | 0.000169 | 0.000000 | 0.000169 | 0.877053 | 0.122608 | 0.000000 | 0.000000 | 0 | 0 |
bewiki | 0.000000 | 0.000000 | 0 | 0.000099 | 0.000000 | 0.000099 | 0.876528 | 0.123273 | 0.000000 | 0.000000 | 0 | 0 |
bewikiquote | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 1.000000 | 0.000000 | 0.000000 | 0.000000 | 0 | 0 |
bewikisource | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 0.952381 | 0.047619 | 0.000000 | 0.000000 | 0 | 0 |
bgwiki | 0.000091 | 0.000045 | 0 | 0.000045 | 0.000091 | 0.000181 | 0.839242 | 0.160305 | 0.000000 | 0.000000 | 0 | 0 |
bgwikiquote | 0.000000 | 0.000000 | 0 | 0.000519 | 0.000000 | 0.000000 | 0.826155 | 0.173326 | 0.000000 | 0.000000 | 0 | 0 |
bgwikisource | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 0.916667 | 0.083333 | 0.000000 | 0.000000 | 0 | 0 |
bhwiki | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 0.875000 | 0.125000 | 0.000000 | 0.000000 | 0 | 0 |
biwiki | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 0.787234 | 0.212766 | 0.000000 | 0.000000 | 0 | 0 |
bjnwiki | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 0.650000 | 0.350000 | 0.000000 | 0.000000 | 0 | 0 |
bmwiki | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 0.875000 | 0.125000 | 0.000000 | 0.000000 | 0 | 0 |
bnwiki | 0.000238 | 0.000000 | 0 | 0.000238 | 0.000238 | 0.000000 | 0.838102 | 0.161183 | 0.000000 | 0.000000 | 0 | 0 |
bnwikisource | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 1.000000 | 0.000000 | 0.000000 | 0.000000 | 0 | 0 |
bowiki | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 0.877778 | 0.122222 | 0.000000 | 0.000000 | 0 | 0 |
bpywiki | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 0.802817 | 0.197183 | 0.000000 | 0.000000 | 0 | 0 |
brwiki | 0.000000 | 0.000000 | 0 | 0.000173 | 0.000000 | 0.000000 | 0.751081 | 0.248746 | 0.000000 | 0.000000 | 0 | 0 |
brwikiquote | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 0.962963 | 0.037037 | 0.000000 | 0.000000 | 0 | 0 |
brwikisource | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 0.904762 | 0.095238 | 0.000000 | 0.000000 | 0 | 0 |
bswiki | 0.000240 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000480 | 0.853033 | 0.146248 | 0.000000 | 0.000000 | 0 | 0 |
bswikiquote | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 0.871166 | 0.128834 | 0.000000 | 0.000000 | 0 | 0 |
bswikisource | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 1.000000 | 0.000000 | 0.000000 | 0.000000 | 0 | 0 |
bugwiki | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 0.250000 | 0.750000 | 0.000000 | 0.000000 | 0 | 0 |
bxrwiki | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 0.923729 | 0.076271 | 0.000000 | 0.000000 | 0 | 0 |
cawiki | 0.000026 | 0.000000 | 0 | 0.000026 | 0.000103 | 0.000129 | 0.870905 | 0.128786 | 0.000026 | 0.000000 | 0 | 0 |
cawikiquote | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 0.900506 | 0.099494 | 0.000000 | 0.000000 | 0 | 0 |
cawikisource | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 0.966667 | 0.033333 | 0.000000 | 0.000000 | 0 | 0 |
cbk_zamwiki | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 0.716867 | 0.283133 | 0.000000 | 0.000000 | 0 | 0 |
cdowiki | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 0.900000 | 0.100000 | 0.000000 | 0.000000 | 0 | 0 |
cebwiki | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 0.848739 | 0.151261 | 0.000000 | 0.000000 | 0 | 0 |
cewiki | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 0.939623 | 0.060377 | 0.000000 | 0.000000 | 0 | 0 |
chrwiki | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 0.513514 | 0.486486 | 0.000000 | 0.000000 | 0 | 0 |
chwiki | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 0.916667 | 0.083333 | 0.000000 | 0.000000 | 0 | 0 |
chywiki | 0.000000 | 0.000000 | 0 | 0.000000 | 0.000000 | 0.000000 | 0.718750 | 0.281250 | 0.000000 | 0.000000 | 0 | 0 |
ckbwiki | 0.000000 | 0.000588 | 0 | 0.000000 | 0.000588 | 0.000000 | 0.893067 | 0.105758 | 0.000000 | 0.000000 | 0 | 0 |
commonswiki | 0.000045 | 0.000000 | 0 | 0.000045 | 0.000089 | 0.000268 | 0.810849 | 0.188614 | 0.000045 | 0.000045 | 0 | 0 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
396 rows × 12 columns
sex_df['total'] = sex_df.sum(axis=1)
# Keep only sites with more than 10,000 P21 values, ordered by share of 'female'.
female_sorted_10000_items = norm_sex[sex_df['total']>10000].sort_values('female', ascending=True)
female_sorted_10000_items.plot(kind='bar', stacked=True, legend=True, figsize=(13,8), ylim=(0,1),
    title='''Composition of Wikidata Property:P21 "Sex or Gender" by Language
(Languages with over 10,000 associated P21)''')
<matplotlib.axes.AxesSubplot at 0x7f272c8852d0>
Or, optionally, come to the Sunday hackathon and we can look at producing your idea, or help redo this same analysis with a time component, probably by decade.
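For the by-decade idea, one possible starting point is to bucket each person by the decade of a date, for example date of birth (P569). The sketch below assumes time strings in the "+YYYY-MM-DDT00:00:00Z" style that Wikidata uses; the helper names and the toy data are invented for illustration.

```python
from collections import Counter


def year_of(time_string):
    """Extract the signed year from a string like '+1952-03-11T00:00:00Z'."""
    # The sign is the first character; the year ends at the next "-".
    return int(time_string[:time_string.index("-", 1)])


def decade_of(year):
    """Round a year down to the start of its decade (1952 -> 1950)."""
    return year // 10 * 10


# Toy (date, P21 value) pairs, not real data.
toy_people = [
    ("+1952-03-11T00:00:00Z", "Q6581097"),
    ("+1958-07-30T00:00:00Z", "Q6581072"),
    ("+1815-12-10T00:00:00Z", "Q6581072"),
]

by_decade = Counter((decade_of(year_of(t)), sex) for t, sex in toy_people)
print(sorted(by_decade.items()))
```

The counter plays the same role as the site-by-value bins earlier, with the decade taking the place of the site.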