Designing Data Structure

During programming, it's important to design and use right data structures.

Here is a list of problems that we can use to improve our data structure design skills.

Design an in-memory search engine
How to index and store in memory
How to support free text queries, phrase queries
Map<String, List<Document>>
List<Document> is sorted
Document: docId, List<Long> positionIds
List<Long> is sorted
How to save to files
How to merge small files into big files
-- When save to file, make it sorted by word
-- Use merge sort to merge multiple fields
How to make it scalable - how solr cloud works

Google – Manager Peer Problem
1. setManager(A, B) sets A as a direct manager of B
2. setPeer(A, B) sets A as a colleague of B. After that, A and B will have the same direct Manager.

3. query(A, B) returns if A is in the management chain of B.
Tree + HashMap
Map<Integer, TNode> nodeMap

TNode: value, parent, neighbors

Design an Excel sheet’s Data structure
http://codeinterviews.com/Uber-Design-Excel/
You need to perform operations like addition. The excel sheet is very sparse and is used to store numbers in the range 1-65K. Index for a cell is known.
Sparse table: Map<Integer, Map<Integer, String>> data
Follow-up question: In excel, one cell can refer to other cells, if I update one cell, how do you update all the dependent cells?
--Topological sort

- Use multiple data structures
Design a data structure that supports insert, delete, search and getRandom in constant time
private List<String> list = new ArrayList<String>();
Map<String,Integer> indexes = new HashMap<String,Integer>();
-- When remove key, swap the old element in list with the last element, and change the last element index to its new location

Follow up
- What if the value may be duplicated?
- How to test getRandom()?
-- Implement addItem - O(1), getTop10Items
-- Implement HashTable with get,set,delete,getRandom functions in O(1).

Implement Get and Insert for TimeTravelingHashTable
- insert(key, value, timestamp)
- get(key, timestamp)
- get(key) // returns value associated with key at latest time.
Map<K, TreeMap<Float, V>> keyToBSTMap = new HashMap<>();
public V get(K k, Float time){
        if(keyToBSTMap.containsKey(k) == false) return null;
        if(keyToBSTMap.get(k).containsKey(time))
                return  keyToBSTMap.get(k).get(time);
        else
                return  keyToBSTMap.get(k).lowerEntry(time).getValue();
}

Stack supports getMin or getMax
Stack<Integer> main = new Stack<>();
Stack<Integer> minStack = new Stack<>();

Design a stack with operations(findMiddle|deleteMiddle) on middle element
-- Use Double Linked List

LeetCode 311 - Sparse Matrix Multiplication
https://discuss.leetcode.com/topic/30631/my-java-solution
Map<Integer, HashMap<Integer, Integer>> tableA, tableB

Design SparseVector:
-- supports dot(SparseVector that)
Design SparseMatrix
-- supports plus(SparseMatrix that)

Google - remove alarm
https://reeestart.wordpress.com/2016/06/30/google-remove-alarm/
hash map - map priority to set of alarm id

max priority heap - PriorityQueue<Integer>

Recover a Quack Data Structure
Copy its all elements in to an array
--it's pop()/peek() randomly remove or return first or last element

Two Sum III - Data structure design
add - Add the number to an internal data structure.
find - Find if there exists any pair of numbers which sum is equal to the value.
1. O(1) add, O(n) find
2. O(n) add, O(1) find

Add and Search Word
search(word) can search a literal word or a regular expression string containing only letters a-z or .. A . means it can represent any one letter.

Shortest Word Distance II
Design a class which receives a list of words in the constructor, and implements a method that takes two words word1 and word2 and return the shortest distance between these two words in the list.


Your method will be called repeatedly many times with different parameters.
Map<String, List<Integer>> map:  word -< sorted index

Post a Comment

Labels

Java (159) Lucene-Solr (110) All (60) Interview (59) J2SE (53) Algorithm (37) Eclipse (35) Soft Skills (35) Code Example (31) Linux (26) JavaScript (23) Spring (22) Windows (22) Web Development (20) Tools (19) Nutch2 (18) Bugs (17) Debug (15) Defects (14) Text Mining (14) J2EE (13) Network (13) PowerShell (11) Chrome (9) Continuous Integration (9) How to (9) Learning code (9) Performance (9) UIMA (9) html (9) Design (8) Dynamic Languages (8) Http Client (8) Maven (8) Security (8) Trouble Shooting (8) bat (8) blogger (8) Big Data (7) Google (7) Guava (7) JSON (7) Problem Solving (7) ANT (6) Coding Skills (6) Database (6) Scala (6) Shell (6) css (6) Algorithm Series (5) Cache (5) IDE (5) Lesson Learned (5) Miscs (5) Programmer Skills (5) System Design (5) Tips (5) adsense (5) xml (5) AIX (4) Code Quality (4) GAE (4) Git (4) Good Programming Practices (4) Jackson (4) Memory Usage (4) OpenNLP (4) Project Managment (4) Python (4) Spark (4) Testing (4) ads (4) regular-expression (4) Android (3) Apache Spark (3) Become a Better You (3) Concurrency (3) Eclipse RCP (3) English (3) Firefox (3) Happy Hacking (3) IBM (3) J2SE Knowledge Series (3) JAX-RS (3) Jetty (3) Restful Web Service (3) Script (3) regex (3) seo (3) .Net (2) Android Studio (2) Apache (2) Apache Procrun (2) Architecture (2) Batch (2) Build (2) Building Scalable Web Sites (2) C# (2) C/C++ (2) CSV (2) Career (2) Cassandra (2) Distributed (2) Fiddler (2) Google Drive (2) Gson (2) Html Parser (2) Http (2) Image Tools (2) JQuery (2) Jersey (2) LDAP (2) Life (2) Logging (2) Software Issues (2) Storage (2) Text Search (2) xml parser (2) AOP (1) Application Design (1) AspectJ (1) Bit Operation (1) Chrome DevTools (1) Cloud (1) Codility (1) Data Mining (1) Data Structure (1) ExceptionUtils (1) Exif (1) Feature Request (1) FindBugs (1) Greasemonkey (1) HTML5 (1) Httpd (1) I18N (1) IBM Java Thread Dump Analyzer (1) JDK Source Code (1) JDK8 (1) JMX (1) Lazy Developer (1) Mac (1) Machine Learning (1) Mobile (1) My Plan for 2010 (1) Netbeans (1) Notes (1) Operating System (1) Perl (1) Problems (1) Product Architecture (1) Programming Life (1) Quality (1) Redhat (1) Redis (1) Review (1) RxJava (1) Solutions logs (1) Team Management (1) Thread Dump Analyzer (1) Troubleshooting (1) Visualization (1) boilerpipe (1) htm (1) ongoing (1) procrun (1) rss (1)

Popular Posts