Social Network Analysis With Python @ PyCon APAC 2014 David Chiu
About Me: Co-founder of NumerInfo, ex-Trend Micro engineer, ywchiu.com
Social Network http://libeltyseo.com/wp-content/uploads/2013/03/social-networking.png
Human Nature http://cdn.macado.com/assets/2010/03/peeping-tom.gif
What do we want to know?
Who knows whom, and which people are common to their social networks?
How frequently are particular people communicating with one another?
Which social network connections generate the most value for a particular niche?
How does geography affect your social connections in an online world?
Who are the most influential/popular people in a social network?
What are people chatting about (and is it valuable)?
What are people interested in, based upon the human language that they use in a digital world?
Explore Facebook
OAuth2 Flow: OAuth is an open standard for authorization. It provides a method for clients to access server resources on behalf of a resource owner.
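To make the flow concrete, here is a minimal sketch of the client side of an implicit-grant style exchange: build the URL the resource owner visits, then read the token from the redirect. The dialog endpoint, the `response_type=token` parameter, and the helper names `build_login_url` and `extract_token` are assumptions for illustration and are not from the slides; check Facebook's current login documentation before relying on them.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Assumed endpoint of Facebook's login dialog (not taken from the slides).
DIALOG_URL = "https://www.facebook.com/dialog/oauth"


def build_login_url(app_id, redirect_uri, permissions):
    """Return the URL the resource owner visits to grant access."""
    query = urlencode({
        "client_id": app_id,
        "redirect_uri": redirect_uri,
        "scope": ",".join(permissions),   # e.g. user_likes, read_friendlists
        "response_type": "token",         # assumed: token returned in the fragment
    })
    return DIALOG_URL + "?" + query


def extract_token(redirected_url):
    """Pull access_token out of the URL fragment after the redirect."""
    fragment = urlparse(redirected_url).fragment
    values = parse_qs(fragment).get("access_token")
    return values[0] if values else None
```

For quick experiments, the Graph API Explorer shown on the following slides hands out a token without any of this code.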
Connect to Facebook https://developers.facebook.com/
Get Access Token https://developers.facebook.com/tools/explorer/
User Permission
Permission List
User Data Permissions: user_hometown, user_location, user_interests, user_likes, user_relationships
Friends Data Permissions: friends_hometown, friends_location, friends_interests, friends_likes, friends_relationships
Extended Permissions: read_friendlists
Copy Token
Social Network Analysis With Python Let's Hack
Get Information From Facebook
Test On API Explorer
Required Packages
requests: sends HTTP requests to retrieve data from Facebook
json: parses the JSON responses
Facebook Connect
    import json
    import requests

    access_token = "<access_token>"
    url = "https://graph.facebook.com/me?access_token=%s"

    # Request the current user's profile and decode the JSON body.
    response = requests.get(url % access_token)
    fb_data = json.loads(response.text)
    print(fb_data)
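An expired or mistyped token makes the Graph API answer with a JSON object that carries an `error` key instead of the profile. The sketch below shows one way to check for that before using the data. `GraphAPIError` and `parse_graph_response` are hypothetical names introduced here, not part of requests or of any Facebook SDK.

```python
import json


class GraphAPIError(Exception):
    """Hypothetical exception type for failed Graph API calls."""


def parse_graph_response(text):
    """Decode a response body and raise if it describes an error."""
    data = json.loads(text)
    if isinstance(data, dict) and "error" in data:
        err = data["error"]
        raise GraphAPIError("%s (%s)" % (err.get("message"), err.get("type")))
    return data
```

In the slide's code this would replace the bare `json.loads(response.text)` call, as in `fb_data = parse_graph_response(response.text)`.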
Question: Who Likes My Post The Most?
Get Likes Count of Posts
    access_token = '<access_token>'
    url = "https://graph.facebook.com/me/posts?access_token=%s"
    response = requests.get(url % access_token)
    fb_data = json.loads(response.text)

    # Count how many of my posts each person has liked.
    count_dic = {}
    for post in fb_data['data']:
        if 'likes' in post:
            for rec in post['likes']['data']:
                if rec['name'] in count_dic:
                    count_dic[rec['name']] += 1
                else:
                    count_dic[rec['name']] = 1
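The counting loop gives a dictionary, but the question asks for a ranking. A minimal sketch of the same count with `collections.Counter`, sorted from most to fewest likes, could look like this. `rank_likers` is a hypothetical helper, and `posts` is assumed to be the `fb_data['data']` list from the slide.

```python
from collections import Counter


def rank_likers(posts):
    """Return (name, like_count) pairs, most frequent liker first."""
    counts = Counter()
    for post in posts:
        # Posts without any likes have no 'likes' key, so default to empty.
        for rec in post.get("likes", {}).get("data", []):
            counts[rec["name"]] += 1
    return counts.most_common()
```

Calling `rank_likers(fb_data['data'])[:5]` would then list the five people who liked the most posts on the fetched page.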
Simple, Ha! Ask a Harder Question!
Question: What Are People Talking About?
Take the Cross-Strait Agreement as an Example
    import re

    # userid is the Facebook ID of the person whose posts are scanned.
    keyword_dic = {}
    posts_url = 'https://graph.facebook.com/%s/posts?access_token=%s'
    post_response = requests.get(posts_url % (userid, access_token))
    post_json = json.loads(post_response.text)
    for post in post_json['data']:
        if 'message' in post:
            # '服貿' is the short name of the Cross-Strait Service Trade Agreement.
            m = re.search('服貿', post['message'])
            if m:
                if userid not in keyword_dic:
                    keyword_dic[userid] = 1
                else:
                    keyword_dic[userid] += 1
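The request above only sees the first page of posts. Graph API list responses include a `paging` object whose `next` field holds the URL of the following page, so a fuller count has to follow those links. The sketch below assumes that shape. `iter_posts` and `count_keyword` are hypothetical helpers, and `fetch` stands for any callable that turns a URL into decoded JSON.

```python
def iter_posts(first_url, fetch):
    """Yield every post, following paging.next links until none is left."""
    url = first_url
    while url:
        page = fetch(url)
        for post in page.get("data", []):
            yield post
        # The last page has no 'next' entry, which ends the loop.
        url = page.get("paging", {}).get("next")


def count_keyword(posts, keyword):
    """Count posts whose message contains the keyword."""
    return sum(1 for post in posts if keyword in post.get("message", ""))
```

With requests, `fetch` could be `lambda u: requests.get(u).json()`, and the count for one user becomes `count_keyword(iter_posts(posts_url % (userid, access_token), fetch), '服貿')`.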
Text Mining NLTK!
Sorry! My Facebook Friends Speak In Mandarin Jieba!
Using Jieba For Word Tokenization
    import jieba

    data = post_json['data']
    dic = {}
    for rec in data:
        if 'message' in rec:
            # jieba.cut splits a Chinese sentence into word segments.
            seg_list = jieba.cut(rec['message'])
            for seg in seg_list:
                if seg in dic:
                    dic[seg] = dic[seg] + 1
                else:
                    dic[seg] = 1
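Raw segment counts tend to be dominated by punctuation and single-character particles. The sketch below drops very short tokens and returns only the most frequent words. `top_words` and its `min_length` threshold are assumptions for illustration; the default tokenizer is `str.split` so the example runs without jieba installed.

```python
from collections import Counter


def top_words(messages, tokenize=str.split, min_length=2, n=10):
    """Return the n most common tokens that have at least min_length characters."""
    counts = Counter()
    for message in messages:
        for token in tokenize(message):
            token = token.strip()
            if len(token) >= min_length:
                counts[token] += 1
    return counts.most_common(n)
```

For the Mandarin posts in the slide, the call would be `top_words(messages, tokenize=jieba.cut)`, where `messages` is the list of `rec['message']` strings.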
Question: How to Identify Social Groups?
Required Packages
networkx: analyzes the social network
community: community detection using the Louvain method
Social Network: Man As Node, Connection As Edge
Build Friendship Matrix
    import networkx as nx

    # friends_obj is assumed to be the decoded friend list of the current user.
    mutual_friends = {}
    mutual_url = "https://graph.facebook.com/%s/mutualfriends?access_token=%s"
    for friend in friends_obj['data']:
        res = requests.get(mutual_url % (friend['id'], access_token))
        response_data = json.loads(res.text)['data']
        mutual_friends[friend['name']] = [data['name'] for data in response_data]

    # Link me to every friend, then link friends who know each other.
    nxg = nx.Graph()
    for mf in mutual_friends:
        nxg.add_edge('me', mf)
    for f1 in mutual_friends:
        for f2 in mutual_friends[f1]:
            nxg.add_edge(f1, f2)
Draw Network Plot
    nx.draw(nxg)
Calculate Network Property
    nx.betweenness_centrality(nxg)
    nx.degree_centrality(nxg)
    nx.closeness_centrality(nxg)
Community Detection
    import community  # provided by the python-louvain package

    def find_partition(graph):
        # Map each node to a community ID with the Louvain method.
        return community.best_partition(graph)

    new_G = find_partition(nxg)
Draw Social Network Communities
    import matplotlib.pyplot as plt

    size = float(len(set(new_G.values())))
    pos = nx.spring_layout(nxg)
    count = 0.
    for com in set(new_G.values()):
        count = count + 1.
        list_nodes = [nodes for nodes in new_G.keys() if new_G[nodes] == com]
        # Draw each community in its own grayscale shade.
        nx.draw_networkx_nodes(nxg, pos, list_nodes, node_size=20,
                               node_color=str(count / size))
    nx.draw_networkx_edges(nxg, pos, alpha=0.5)
    plt.show()
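To show what one of these numbers means, here is a hand-rolled version of degree centrality in plain Python: a node's degree divided by the number of other nodes, which is the normalization networkx uses for graphs with at least two nodes. `simple_degree_centrality` and the adjacency-dict input are illustrative assumptions, not the networkx implementation.

```python
def simple_degree_centrality(adjacency):
    """Map each node to degree / (n - 1) for an undirected graph.

    adjacency maps every node to the set of its neighbours.
    """
    n = len(adjacency)
    if n < 2:
        # With fewer than two nodes there is nobody to connect to.
        return {node: 0.0 for node in adjacency}
    return {node: len(neighbours) / (n - 1)
            for node, neighbours in adjacency.items()}
```

In a graph built like the one in the slides, 'me' is linked to every friend and so scores 1.0. Betweenness or closeness is usually the more informative measure for such ego networks.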
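The Louvain method searches for a partition with high modularity, a score that compares the share of edges inside each community with what a random graph with the same degrees would give. The sketch below computes that score for an unweighted, undirected edge list so a partition can be sanity-checked by hand. `modularity` here is a hypothetical standalone helper, not the function shipped with python-louvain.

```python
def modularity(edges, partition):
    """Modularity Q = sum over communities of L_c / m - (d_c / 2m) ** 2.

    edges is a list of (u, v) pairs; partition maps node -> community ID.
    L_c counts edges inside community c, d_c sums the degrees of its nodes.
    """
    m = len(edges)
    if m == 0:
        return 0.0
    internal = {}
    degree = {}
    for u, v in edges:
        cu, cv = partition[u], partition[v]
        degree[cu] = degree.get(cu, 0) + 1
        degree[cv] = degree.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(internal.get(c, 0) / m - (degree[c] / (2 * m)) ** 2
               for c in degree)
```

Two triangles joined by a single edge score 5/14 when each triangle is its own community, and 0 when all six nodes are lumped into one community.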
Community Partitioned Plot
Gephi: open source graph visualization and manipulation software
One More Thing
To build your own data service
jsnetworkx A JavaScript port of the NetworkX graph library.
juimee.com
THANK YOU
