Pinterest Home Feed
- best-first rather than newest-first
- separation of available, but unseen, content and content that’s already been presented
- fan out write pins to followers
- batch write
standby HBase cluster
- speculative execution
- make another call to the standby cluster if the first call to primary cluster fails or doesn’t return within certain time
- randomly forward x percent of calls to the standby cluster to keep it warm
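The speculative-execution pattern above can be sketched as follows; `primary_get`, `standby_get`, and the 50 ms timeout are illustrative stand-ins, not Pinterest's actual client code:

```python
import concurrent.futures

def speculative_get(primary_get, standby_get, key, timeout_s=0.05):
    """Query the primary cluster; if it errors or exceeds the timeout,
    fire the same read at the standby and take the first good result."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=2) as pool:
        primary = pool.submit(primary_get, key)
        try:
            return primary.result(timeout=timeout_s)
        except Exception:
            # Primary is slow or failed: race it against the standby.
            standby = pool.submit(standby_get, key)
            done, _ = concurrent.futures.wait(
                [primary, standby],
                return_when=concurrent.futures.FIRST_COMPLETED)
            for f in done:
                if not f.exception():
                    return f.result()
            # Note: pool shutdown still waits for the slow call to finish.
            return standby.result()  # propagate standby's error if both fail
```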
smart feed worker
- score/store pins: quality and meaning to users
- repins by followers, related pins, interest Pins
- three priority queues (HBase tables) per user; rely on HBase's key-based sorting with a row key of (userid, score, pin)
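One way to make HBase's lexicographic row-key ordering serve a per-user priority queue is a composite key with a fixed-width inverted score; the encoding below (`MAX_SCORE` bound, `|` separator) is an assumption for illustration:

```python
MAX_SCORE = 10**9  # assumed fixed upper bound on scores

def row_key(user_id: str, score: int, pin_id: str) -> str:
    """Build a row key that lexicographic ordering will sort by
    user, then by descending score (inverted), then by pin id."""
    inverted = MAX_SCORE - score          # higher score -> smaller key
    return f"{user_id}|{inverted:010d}|{pin_id}"

# A range scan over user u1's keys now yields pins best-first.
keys = sorted(row_key("u1", s, p) for s, p in [(500, "a"), (900, "b"), (100, "c")])
```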
Smart feed content generator
- pins selected are removed from the pool (HBase table)
Smart feed service
materialized feed
discover interests - Pinterest
- n-grams
- verify with search queries, public domain ontologies like Wikipedia titles
- blacklist, stop words
- implicit and explicit signals from users
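A minimal sketch of n-gram candidate extraction with stop-word and blacklist filtering; the word lists and the two-word limit are invented for the example:

```python
STOP_WORDS = {"the", "a", "of", "and"}   # illustrative stop-word list
BLACKLIST = {"free shipping"}            # illustrative blacklisted phrases

def ngrams(text: str, n_max: int = 2):
    """Yield all 1..n_max word n-grams of the text, lowercased."""
    words = text.lower().split()
    for n in range(1, n_max + 1):
        for i in range(len(words) - n + 1):
            yield " ".join(words[i:i + n])

def interest_candidates(text: str):
    """Keep n-grams that are not blacklisted and contain no stop words;
    survivors would then be verified against search queries / ontologies."""
    return [g for g in ngrams(text)
            if g not in BLACKLIST
            and not any(w in STOP_WORDS for w in g.split())]
```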
related interests
- co-occurrence relationship for interests
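Co-occurrence counting can be sketched as below, assuming each board contributes one set of interests; the sample boards are invented:

```python
from collections import Counter
from itertools import combinations

def related_interests(boards):
    """Count how often two interests co-occur on the same board;
    pairs that co-occur frequently are treated as related."""
    co = Counter()
    for interests in boards:
        for a, b in combinations(sorted(set(interests)), 2):
            co[(a, b)] += 1
    return co

boards = [{"travel", "photography"},
          {"travel", "photography", "food"},
          {"food", "recipes"}]
co = related_interests(boards)
```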
search query
- user's demographic stats
- store and use previous queries
- adjacent queries in the same search session
- clicks after submitting a query
Twitter
- users, tweets, followers, followings
- persistent connection between client and server
Posting a tweet
- write it to a queue and return to the client immediately
- in the background, save to the DB and push to (active) followers' queues
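The posting path above might look like this, with plain in-memory structures standing in for the tweet store and the per-follower Redis queues:

```python
from collections import defaultdict, deque

tweets_db = {}                         # stand-in for the durable tweet store
home_timeline = defaultdict(deque)     # stand-in for per-user Redis lists
followers = {"alice": ["bob", "carol"]}
active_users = {"bob"}                 # only fan out to active users

def post_tweet(author, tweet_id, text):
    tweets_db[tweet_id] = (author, text)        # durable write
    for f in followers.get(author, []):
        if f in active_users:                   # skip inactive followers
            home_timeline[f].appendleft(tweet_id)  # newest first

post_tweet("alice", 1, "hello")
```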
Cache aggressively
Vector Cache
- ids(user, tweet)
- user's followers(id)
Row Cache
- tweet
Fragment Cache
Page Cache
- separated out and placed on different machines
Home Timeline - link
user's timeline vs home timeline
- precomputed and cached in redis for active users
Mixed Pull+Push
- local decisions about each producer/consumer pair
- how often the producer adds an event, how often the consumer views events
- celebrities' tweets - pull
- normal users' tweets - push
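A per-pair decision function along these lines (the thresholds are invented):

```python
def delivery_mode(follower_count, posts_per_day, views_per_day,
                  celebrity_threshold=10_000):
    """Decide per producer/consumer pair: push when the consumer reads
    at least as often as the producer writes and fan-out is cheap."""
    if follower_count > celebrity_threshold:
        return "pull"                    # fan-out-on-write too expensive
    return "push" if views_per_day >= posts_per_day else "pull"
```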
Flock maintains the follower and following lists.
Facebook
Separate data
Recent Activity - hot
- changes frequently, cache is frequently invalidated
- small, fast to regenerate them
Non-Recent Activity - cold
- a lot of data, slow to generate
- but don't change, cache will not be invalidated
Photo storage
- use a CDN for popular images
- use Haystack for other images - long tail
NFS photo infrastructure - old version
- huge amount of metadata, most of it never used (e.g. permissions)
- use a CDN to store images
- three disk operations per read: one (or more) to translate the filename into an inode number, another to read the inode from disk, and a final one to read the file itself
- need to keep all inodes in memory
- CDN good for hot data, but not so for cold data
long tail data
- requests for less popular (often older) content
Haystack - link
Haystack Object Store
- a log structured (append-only) object store containing needles representing the stored objects
- eliminates the metadata overhead by aggregating hundreds of thousands of images in a single haystack store file
- at most one disk operation per read, keep all metadata in main memory
- haystack store file
- index - needle's location, not critical, can be rebuilt, written asynchronously
needle is identified by Offset, Key, Alternate Key, Cookie
- don't allow overwriting an existing needle; instead, append a new version and its index entry
- among needles with the same (Key, Alternate Key, Cookie) tuple, the one with the largest offset is the most recent
Read - Pass needle offset, key, alternate key, cookie and the data size
Delete - set a deleted bit in the flags
- Fault-tolerant: replicates
Photo Store Server - http layer
- keeps a minimal in-memory index
- Google sparse hash data structure
Compaction
- reclaim the space
- copy needles while skipping any duplicate or deleted entries
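The largest-offset-wins index and the compaction pass can be modeled in a few lines; the needle tuple layout here is a simplification of Haystack's on-disk format:

```python
def build_index(log):
    """log: list of (key, alt_key, cookie, offset, deleted, data) needles
    in append order. The largest offset per identity wins, so a simple
    forward pass leaves the latest version in the index."""
    index = {}
    for key, alt_key, cookie, offset, deleted, _ in log:
        index[(key, alt_key, cookie)] = (offset, deleted)
    return index

def compact(log):
    """Rewrite the store file, keeping only live, latest-version needles."""
    index = build_index(log)
    return [n for n in log
            if index[(n[0], n[1], n[2])] == (n[3], n[4])  # latest version
            and not n[4]]                                  # not deleted
```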
Haystack Cache
- insulation if upstream CDN nodes fail
Timeline - your story
News Feed - Multifeed
- multi-fetch and aggregate stories at read time
- aggregator (stateless) and leaves
- aggregator servers send requests to all leaf servers and rank the results
- all data in leafs is in memory (memcached)
- fetch recent activity from all your friends
- gather and Group into stories
- rank stories by relevance
- more flexible, manageable
- fan-out reads can be bounded; writes often cannot
- SSD
- others' stories that relate to you
- rank edges(stories) by importance(relevance) - link
- content score: Σe = Σ(ue · we · de), where Σe is the sum over the story's edges, ue is the affinity score with the user who created the edge, we is the weight for the content type, and de is a time-decay factor
- a friend's affinity score: Σi = Σ(li · ni · wi), where Σi is the sum over interactions with that friend, li is a recency weight for the last interaction (so that 1 day ago counts more than 30 days ago), ni is the number of interactions, and wi is the weight of those interactions
- Activity(id, user_id, source_id, activity_type, edge_rank, parent_id, parent_type, data, time)
- timeline - user's story
- newsfeed
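The ranking formulas above amount to a weighted, time-decayed sum; a sketch with invented weights and an assumed 24-hour decay constant:

```python
import math

def edge_rank(edges, now):
    """Sum over edges of affinity * content-type weight * time decay.
    The weights and the decay half-life are illustrative, not Facebook's."""
    score = 0.0
    for affinity, type_weight, created_at in edges:
        age_hours = (now - created_at) / 3600
        decay = math.exp(-age_hours / 24)   # assumed 24h decay constant
        score += affinity * type_weight * decay
    return score

now = 1_000_000
edges = [(0.9, 2.0, now),            # fresh comment from a close friend
         (0.2, 1.0, now - 86_400)]   # day-old like from a distant one
# the fresh, high-affinity edge dominates the story's score
```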
connections and activities
"Push" Model, or Fan-out-on-write
- write-heavy; only push to active users, and only for non-celebrity producers
- use write-optimized data store such as Cassandra, HBase, or BigTable
"Pull" Model, or Fan-out-on-load
- less data, less computation
- materialize what the user has already viewed into a table; only pull new data
- have a fallback mechanism if fails to pull - show old materialized data
Combination of push and pull
- make decision locally on per consumer/producer basis
- a frequent producer of events may overshadow events from less frequent producers
Selective Fanout Based On Consumer
- only fan-out to active/valuable users
Selective Fanout Based On Producer
- don't fan-out for high profile users
Pull for new users
- because nothing has been pushed to the user yet
- when there is not enough data, pull from the user's followings
Making schema changes is expensive in a traditional DB
- such as add/remove indexes
- Users don't know what they don't know.
- All content is not created equal.
Reddit
- Treat non-logged-in users as second-class citizens
- Return cached content to them
- Put a limit on everything
- Provide timeout
- Expire data. Lock out old threads and create a fully rendered page and cache it.
Red Envelope
- store the count and balance in Redis
- DECR the count; use CAS (WATCH/MULTI or a Lua script) to decrease the balance
- set a TTL on the keys
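The grab operation must decrement the count and deduct a share of the balance atomically; in Redis that is typically one Lua script. Below is a thread-safe in-process model of the same logic, with an invented splitting rule that guarantees every remaining grab gets at least 1 cent:

```python
import random
import threading

class RedEnvelope:
    """In-process model of the Redis count/balance scheme: grab() atomically
    decrements the remaining count and deducts a random share of the balance.
    In Redis, both updates would run inside a single Lua script."""
    def __init__(self, total_cents, count):
        assert total_cents >= count    # every grab must get >= 1 cent
        self.balance = total_cents
        self.count = count
        self.lock = threading.Lock()   # stands in for Redis atomicity

    def grab(self):
        with self.lock:
            if self.count == 0:
                return None                       # envelope exhausted
            if self.count == 1:
                share = self.balance              # last grab takes the rest
            else:
                # leave at least 1 cent for each remaining grab
                share = random.randint(1, self.balance - (self.count - 1))
            self.count -= 1
            self.balance -= share
            return share
```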
Uber
Driver - Long connection
- matches drivers and riders, the supply and demand
Maps/ETA (estimated time of arrival)
Nutch2
- plugins, extensible
- fetching pages, filtering and normalizing URLs and parsing responses
Etsy - link
$choose_connection = mt_rand() < $affinity;
Pruning same/similar messages
combine similar stories into rollups
- group similar stories
Cache
- separate globally shared content from content unique to the person
Cache Warming - gearman
- proactively
- TTL Caching
Lazy Below the Fold
JD
closed data loop, heterogeneous storage, aggregation
EFK - Log collection/visualization
Logstash/Fluentd -> Kafka -> Elasticsearch -> Kibana
Twitter Search
- tweets are stored in an in-memory Lucene index
- rerank based on social proof: the number of retweets, favorites, and replies
- reranking happens at query time, so the index does not need an update every time a tweet is retweeted, favorited, etc.
Lucene