Saturday, 2 April 2016

Mac Shortcuts








Wednesday, 30 March 2016

MariaDB : Should I use it to replace MySQL?

1. What is MariaDB?

  • It is designed as a replacement of MySQL. The user of MySQL should be able to switch to MariaDB without change the data in the database.
  • MariaDB is open-source. GNU.

2.

Saturday, 26 March 2016

Some notes from MySQL

1. Command Line:
https://www.digitalocean.com/community/tutorials/a-basic-mysql-tutorial

Monday, 21 March 2016

Self Reflection

1. Career Strength: T-shape skill sets
    Horizontal skills: Product skills.
    Vertical skills: Data modelling skills.

2. 1 Day Reaction Rule
    If I like it, I will do it in that day.
    If I like it but I did not do it that day, it means I do not like it that much.
    If I did not do anything I like in a week, it means a failure week.


Thursday, 10 March 2016

How to Design App Store

1. Scenario

1) List Apps
2) Search Apps
3) View an App
4) Install & Remove App

2. Numbers


3. Applications


4. Framework


5. Used Libs















(source: bittiger.io)


Saturday, 27 February 2016

Network operation team 
Network operation automation
Tools/machines to make the right decision before taking the remediation actions
Network health service is the tool to provide consultant service for tools or human to make the correct decision. 

Knowledge:
Amazon’s network architecture
Amazon's network configuration and management

How to monitor web service?
Levels:
Matrices:
Dashboard GUI Design:

CloudWatch Dashboards

Dashboard Software - Tableau.com‎

Google Analytics

Wednesday, 24 February 2016

Amazon Leadership Principles

1. Customer Obsession

Leaders start with the customer and work backwards. They work vigorously to earn and keep customer trust. Although leaders pay attention to competitors, they obsess over customers.

Situation - 2014-Sep, our team started one project with Transport Department, NSW. The project's scope is to investigate the benefits of smart motorway signal control in Sydney. 

Task - My role is the principle project investigator. My task is to deliverer a Transport Model to the client in 3 months. more importantly, This is our first project with the client, my task is also to earn the client's trust. 

Activity - For this project, I found that two most important clients in the project: one is a senior transport modeler Christopher and the other is a general transport manager Martin. They both have strong technical background. So, during these three months, I follow two rules in the communication with clients:
1: Heavy client involvement.
2: Focusing on Data.

Result - I would like to mention some results of our team's work:
1: Clients asked lots of technical questions and involve in some technical decisions.
2: This project was later demonstrated on television to the Premier Mike Baird, as a precursor to the Sydney M4 Smart Motorway Project. 
3: Our team signed a much bigger (3+M) project with the client in the end of 2015.
4: When Christopher changed his job to TransUrban, this client relationship brings us new opptunities.

2. Ownership
Leaders are owners. They think long term and don’t sacrifice long-term value for short-term results. They act on behalf of the entire company, beyond just their own team. They never say “that’s not my job".

Situation - 2015-Oct, our team just finished one research Project "Sydney City-scale Transport Insights (SCTI)" and at that time there was one commercial conference hosted by TransUrban in Sydney. It became a good opportunity for us to introduce this project to the potential client.

Task - However, our biz college and my manager have to join another conference in France. So, I volunteered myself to join the event and presented our project to the Technical Manager TransUrban, even through it is not part of my job and definitely not my strength.

Result - the Technical Manager Micheal got interested in our project and later on introduced her college Peterson (the Transport Manager) to me. Then, we scheduled a 2-Hour meeting and agreed to signed the NDA and start a collaboration project. We are currently in the legal process.  

3. Invent and Simplify
Leaders expect and require innovation and invention from their teams and always find ways to simplify. They are externally aware, look for new ideas from everywhere, and are not limited by “not invented here". As we do new things, we accept that we may be misunderstood for long periods of time.

Situation - In the area of Transport Modelling, we have developed our own innovation in 2015 and published the innovation in the best Transportation Journal. We were awarded the ITS Research 2015 because of the innovation. This work is one of our core competitiveness and I am proud of this work. However, recently I read another trend of innovation. And I think in the long round, the other method is better than ours.

Task - As the transport research leader, I need to decide whether we need to reinvent our method. 

Result - I talked with my manager and we decided to start a small project to reinvent our method. I think it is a correct decision.

4. Are Right, A Lot

Leaders are right a lot. They have strong business judgment and good instincts. They seek diverse perspectives and work to disconfirm their beliefs.

Situation - In Feb 2016, our team decided to step into a new research area named Dynamic Traffic Modelling.

Task - It was my responsibility to set up the technical framework and point out our research opportunity.

Action - A technical framework consists of a number of decisions. If the decision is about Transportation, Algorithms and System Design, I will make decisions and explain to our team. If the decision is about Machine Learning and Mathematics, I will discuss the decision with my senior college. 

5. Hire and Develop The Best

Leaders raise the performance bar with every hire and promotion. They recognize exceptional talent, and willingly move them throughout the organization. Leaders develop leaders and take seriously their role in coaching others.  We work on behalf of our people to invent mechanisms for development like Career Choice.

Not much experience.

6. Insist on the Highest Standards

Leaders have relentlessly high standards - many people may think these standards are unreasonably high. Leaders are continually raising the bar and driving their teams to deliver high quality products, services and processes. Leaders ensure that defects do not get sent down the line and that problems are fixed so they stay fixed.

Situation - In 2012, when I was in my PhD. One of my friend from MIT and I were working on Traffic Model Calibration. The measurement of our job is the error between the model's output and the real-world measurement. Usually, when the error is <20%, we would think the model is valid.

Action - However, both my friend and me were kind of obsessed with reducing the error. We will take about methods and spend days doing experiments to reduce errors even by 0.1%.

7. Think Big

Thinking small is a self-fulfilling prophecy. Leaders create and communicate a bold direction that inspires results. They think differently and look around corners for ways to serve customers.

No experience here.

8. Bias for Action

Speed matters in business. Many decisions and actions are reversible and do not need extensive study. We value calculated risk taking. 

9. Frugality
Accomplish more with less. Constraints breed resourcefulness, self-sufficiency and invention. There are no extra points for growing headcount, budget size or fixed expense.

10. Learn and Be Curious

Leaders are never done learning and always seek to improve themselves. They are curious about new possibilities and act to explore them.

Situation - My career aim is to keep track of new ideas and convert my ideas into products. Following this way, I will think of where I can improve myself in Algorithm, System Design and Data Analytics.

Action: For example, recently I found an online web seminar: BiiTiger.com, the host has been working on Tie-1 Technology company for 10+ years and will explain the design of one hot technique or product every week. If I am not busy, I will spent the whole weekend in following the online course. I plan to follow it in the next 1 year. It will definitely improve my skills in System Design.

11. Earn Trust

Leaders listen attentively, speak candidly, and treat others respectfully. They are vocally self-critical, even when doing so is awkward or embarrassing.  Leaders do not believe their or their team’s body odor smells of perfume.  They benchmark themselves and their teams against the best.

12. Dive Deep

Leaders operate at all levels, stay connected to the details, audit frequently, and are skeptical when metrics and anecdote differ. No task is beneath them.

13. Have Backbone; Disagree and Commit

Leaders are obligated to respectfully challenge decisions when they disagree, even when doing so is uncomfortable or exhausting. Leaders have conviction and are tenacious. They do not compromise for the sake of social cohesion. Once a decision is determined, they commit wholly.

14. Deliver Results

Leaders focus on the key inputs for their business and deliver them with the right quality and in a timely fashion. Despite setbacks, they rise to the occasion and never settle.

Situation - 2014-Sep, our team started one project with Transport Department, NSW. 

Task - My role is the principle project investigator. My task is to deliverer a Transport Model to the client in 3 months. more importantly, This is our first project with the client, my task is also to earn the client's trust. However, I found that I do not have sufficient time to evaluate the model according to the clients' requirement. 

Activity - The good thing is that during the project, our team has shown our professions to our clients. Our relationship is smooth. So, I scheduled a meeting with two senior guys from the client company and proposed to evaluate the model using one method instead of two methods. 

Result - I would like to mention some results of our team's work:
1: Clients agreed with my proposal but asked to explain the evaluation process in the report. It was very reasonable.
2: This project was delivered in time and later demonstrated on television to the Premier Mike Baird, as a precursor to the Sydney M4 Smart Motorway Project. 
3: Our team signed a much bigger (3+M) project with the client in the end of 2015.


Monday, 22 February 2016

System Design Blogs

1. 推荐系统的理论
https://www.ibm.com/developerworks/cn/web/1103_zhaoct_recommstudy1/

Keywords:
Demographic-based Recommendation
Cold Start
Content-based Recommendation【新品推荐】
Collaborative Filtering-based Recommendation
User-based Recommendation【主流推荐】
K-Neighbor
Item-based Recommendation
Model-based Recommendation
Weighted Hybridization、Switching Hybridization,Mixed Hybridization

补充:

什么是协同过滤

协同过滤是利用集体智慧的一个典型方法。要理解什么是协同过滤 (Collaborative Filtering, 简称 CF),首先想一个简单的问题,如果你现在想看个电影,但你不知道具体看哪部,你会怎么做?大部分的人会问问周围的朋友,看看最近有什么好看的电影推荐,而我们一般更倾向于从口味比较类似的朋友那里得到推荐。这就是协同过滤的核心思想。
协同过滤一般是在海量的用户中发掘出一小部分和你品位比较类似的,在协同过滤中,这些用户成为邻居,然后根据他们喜欢的其他东西组织成一个排序的目录作为推荐给你。
补充2:
对于 User CF,推荐的原则是假设用户会喜欢那些和他有相同喜好的用户喜欢的东西,但如果一个用户没有相同喜好的朋友,那 User CF 的算法的效果就会很差,所以一个用户对的 CF 算法的适应度是和他有多少共同喜好用户成正比的。
Item CF 算法也有一个基本假设,就是用户会喜欢和他以前喜欢的东西相似的东西,那么我们可以计算一个用户喜欢的物品的自相似度。一个用户喜欢物品的自相似度大,就说明他喜欢的东西都是比较相似的,也就是说他比较符合 Item CF 方法的基本假设

2. 密码保存算法
http://www.zhihu.com/question/20479856
Salt + Hash + Delay

3. 一致性解释
http://blog.csdn.net/duxingxia356/article/details/43992015

4. What really happens when you navigate to a URL
http://igoro.com/archive/what-really-happens-when-you-navigate-to-a-url/

5. REGULAR EXPRESSION
https://regex101.com/

6. 并发处理
Java 基本:http://langgufu.iteye.com/blog/2152608
Java Conditional Lock: http://www.cnblogs.com/skywang12345/p/3496716.html
Java Semaphore: http://www.cnblogs.com/skywang12345/p/3534050.html
Lock-Free Implementation: http://ifeve.com/lmax/


Wednesday, 3 February 2016

Combination and Permutation

Concept:

The fundamental difference is:
  • Combination does not care ORDER
  • Permutation cares ORDER
Combination:

1. Without duplicate numbers (Any Length)
*There are 2^n solutions.
  for (int i = 0; i < 2^n; i++) {
    find the solution for i.
  }

 2.  Without duplicate numbers (Only Length k)
        if (solution.size() == k) {
            rst.add(new ArrayList(solution));
            return;
        }        

        for (int i = start; i <= n; i++) {// <=n because we want 1 ~ n in this problem
            solution.add(i);
            helper(rst, solution, n, k, i + 1);
            solution.remove(solution.size() - 1); //Back-track
        }

3. With duplicate numbers (but not allowing replication [each number can be used at most 1 time])
1) Build A HashSet
2) Sort the numbers
    Then, the idea is to avoid the same number in the same level.
    int prev = -1;//Repeat detection
        for (int i = start; i < num.length; i++) {
            if (prev != -1 && prev == num[i]) {
                continue;
            }

            list.add(num[i]);
            sum += num[i];
            helper(rst, list, num, target, sum, i + 1);

            //Back track:
            sum -= num[i];
            list.remove(list.size() - 1);

            //Repeat Detection
            prev = num[i];
        }

4. With duplicate numbers (allowing replication)
1) Build A HashSet
2) Sort the numbers
    Then, the idea is to avoid the same number in the same level.
    int prev = -1;//Repeat detection
        for (int i = start; i < num.length; i++) {
            if (prev != -1 && prev == num[i]) {
                continue;
            }

            list.add(num[i]);
            sum += num[i];
            helper(rst, list, num, target, sum, i);   //allow replication

            //Back track:
            sum -= num[i];
            list.remove(list.size() - 1);

            //Repeat Detection
            prev = num[i];
        }

Permutation:

1. Basic Version (No duplication)
for (int i = 0; i < nums.size(); i++) {
            if (accessed[i] == true) {
                continue;
            }
            
            accessed[i] = true;
            list.add(nums.get(i));
            permuteRecursively(results, list, accessed, nums);
            accessed[i] = false;
            list.remove(list.size() - 1);
        }
2. Basic Version (No duplication) But No recursion
Queue<ArrayList<Integer>> queue = new LinkedList<ArrayList<Integer>>();
        ArrayList<Integer> list;
        for (int num : nums) {
            list = new ArrayList<Integer>();
            list.add(num);
            queue.offer(new ArrayList<Integer>(list));
        }

        while (!queue.isEmpty()) {
            list = queue.poll();
            if (list.size() == nums.size()) {
                rst.add(new ArrayList<Integer>(list));
                continue;
            }
            for (int i = 0; i < nums.size(); i++) {
                if (!list.contains(nums.get(i))) {
                    list.add(nums.get(i));
                    queue.offer(new ArrayList<Integer>(list));
                    list.remove(list.size() - 1);
                }
            }
        }

3. (With duplication)
    Sort nums 
    It is the magic part:

if (list.size() == nums.size()) {
            rst.add(new ArrayList<Integer>(list));
            return;
        }

        for (int i = 0; i < nums.size(); i++) {
            if (mark[i] || (i != 0 && mark[i - 1] && nums.get(i) == nums.get(i - 1))) {
                continue;
            }           
            list.add(nums.get(i));
            mark[i] = true;
            dfs(nums, list, rst, mark);
            list.remove(list.size() - 1);
            mark[i] = false;
        }


Tuesday, 12 January 2016

Error Type

Dynamic Programming

  • [N+1] for string
  • A.charAt(N) ERROR
  • Return not dp[N] or dp[N][M], but some MAX or MIN

LinkedList
  • Remove Node p (setting p.next = null)
  • break link (setting the left end's next = null)

Comparator
  • private Comparator<ListNode> listNodeComparator = new Comparator<ListNode>() {
  •         public int compare(ListNode left, ListNode right) {
  •             if (left == null) {
  •                 return 1;
  •             } else if (right == null) {
  •                 return -1
  •             }
  •             
  •             return left.val - right.val;
  •         }
  •     };

Classic Data Structures and Functions


1. ArrayList
  • void add(int index, Object element)
  • boolean add(Object o)
  • boolean addAll(Collection c)
  • boolean addAll(int index, Collection c)
  • void clear()
  • Object clone(): Returns a shallow copy of this ArrayList.
  • boolean contains(Object o)
  • void ensureCapacity(int minCapacity)
  • Object get(int index)
  • int indexOf(Object o)
  • int lastIndexOf(Object o)
  • Object remove(int index)
  • protected void removeRange(int fromIndex, int toIndex)
  • Object set(int index, Object element)
  • int size()
  • Object[] toArray()
  • Object[] toArray(Object[] a)
  • void trimToSize()
2. HashMap

  • void clear()
  • Object clone(): Returns a shallow copy of this HashMap instance.
  • boolean containsKey(Object key)
  • boolean containsValue(Object value)
  • Set entrySet()
  • Object get(Object key)
  • boolean isEmpty()
  • Set keySet()
  • Object put(Object key, Object value)
  • putAll(Map m)
  • Object remove(Object key)
  • int size()
  • Collection values()


2.2 HashSet
  • add()
  • clear()
  • clone()
  • contains()
  • isEmpty()
  • remove()
  • size()

3. Queue / LinkedList

  • boolean add(E e)
  • void add(int index, E element)
  • boolean addAll(Collection<? extends E> c)
  • void clear()
  • Object clone()
  • boolean contains(Object o)
  • E get(int index)
  • E peek()
  • E poll()
  • int size()
  • Object[] toArray()
  • public boolean isEmpty()
4. Stack
  • boolean empty()
  • Object peek( )
  • Object pop( )
  • Object push(Object element)
  • int search(Object element)
5. StringBuilder 

  • StringBuilder append(char c)
  • StringBuilder append(char[] str)
  • StringBuilder append(String s)
  • StringBuilder delete(int start, int end)
  • StringBuilder deleteCharAt(int index)
  • void setCharAt(int index, char c)
  • void insert(int index, char/string s)
  • StringBuilder reverse()
  • String toString()
  • int length()
6. String

  • char charAt(int index)
  • boolean endsWith(String suffix)
  • byte getBytes()
  • int hashCode()
  • int length()
  • boolean matches(String regex)
  • String replace(char oldChar, char newChar)
  • String replaceAll(String regex, String replacement)
  • String replaceFirst(String regex, String replacement)
  • String[] split(String regex)
  • boolean startsWith(String prefix)
  • String substring(int beginIndex)
  • String substring(int beginIndex, int endIndex)
  • char[] toCharArray()
  • String toLowerCase()
  • String toUpperCase()
  • String trim()
7. Heap / Priority Queue

  • boolean add(E e)
  • void clear()
  • boolean contains(Object o)
  • Iterator<E> iterator()
  • boolean offer(E e)
  • E peek()
  • E poll()
  • boolean remove(Object o)
  • int size()
  • Object[] toArray()
  • boolean isEmpty()

Thursday, 7 January 2016

Design '秒杀'

1 Cases:

Case 0: A buyer searches product list.
Case 1: A buyer makes an order by clicking the "Order" button.
Case 2: A buyer makes an order by frequently clicking the "Order" button.
Case 2: A buyer makes an order by sending HTTP requests using software.

2 Constraint:

Assuming the number of active buyers in the day is 100 Million Buyers.
Assuming the number of buyers during a peak second is: 100 Million Buyers / (12 * 3600) * 10 = 23150 Buyers. Assuming each buyer requests for 1 product in the peak second, there are 23150 requests per second.

Assuming that there are 10 Million products.

3. Data

Product
(Database: MySQL)

Transaction
(Database: MySQL)

4. System


Wednesday, 6 January 2016

Design Google Analytics

1: Scenario

1. A user register/remove/log in/out Google Analytics.
2. A user register/remove a website in Google Analytics.
3. A user access a report in Google Analytics.
4. A visitor visits a web page and send some data to Google Analytics.

2: Constraints:

Assumed No. of Uses: 0.5 Million.
Assumed No. of websites: 1 Million.
Assumed No. of webpages: 1 Million * 100 = 100 Million.
Assumed No. of new websites per day: 1 Million * 0.1% = 1 K
Assumed No. of new websites per a peak second: 1 Million * 0.1% / (24 * 3600) * 10 = 0.1
Assumed No. of webpages access per day: 1 Million * 100 * 0.1 = 10 Million;
Assumed No. of webpages access in a peak second: 10 Million / (24 * 3600) * 10 = 1,157 visits;
Assumed No. of report checks per day: 1 Million
Assumed No. of report checks in a peak second: 1 Million / (24 * 3600) * 10 = 116 visits;
Assumed Traffic in a peak second: 116 * 1Mb/s + 1157 * 100kb/s = 232 Mb/s
Assumed Memory in a peak second (in memory report): 1 Million * 100 Kb = 100 Gb

3. Data

User Account
Re-use Google Account

Website Account & Webpage Access Record
Database (e.g. NoSQL)

Website Analytics Report
Database (e.g. NoSQL) + In Memory Database

4. Key Algorithm
Real-Time Report Generation

5. All In One

#In this framework, the write operations are very heavy and the read operations can allow some delays. So, I decide to split the read and write database for efficiency.

Tuesday, 5 January 2016

Design a distributed crawl system

1. Constraint

Assuming 1 trillion web pages
Assuming each page has 10 links to other web pages on average
Assuming each web page is updated every 7 days on average

So, the task is to crawl 1 trillion web pages in 7 days, which is 1,653,439 web pages per second.
So, the peak hour task is 1,653,439 * 5 = 8,267,195 web pages per second.

Assuming that each machine can process 10 web pages per second. It requires 826,720 machines.
Assuming that each data center has 10,000 machines. It requires 83 data centers.

Assuming that each master machine can manage 1000 slave machines, it needs 830 master machines (each data center needs 10 master machines). A partitioning method is required between master machines. (e.g. prefix)

Assuming there are 1 million key words and each keyword is associated with 100,000,000 web pages. The data size requirement is 1,000,000 * ( [20 Bytes] + 100,000,000 * [20 Bytes]) = 
2,000,000,000,000,000 Byte = 2,000 TB.

Assuming that each machine has 1TB space, there is 2,000 machines.

2. Framework


Design User System

Scenario:

  • Log in/out
  • Register/Remove
  • Update
  • Forgot Password

Necessity:

  • No of Users
  • No of Peak users
  • Frequency of operators


Database (User Table):

  • long userID; P_KEY & Index
  • string userName; //max length as 50
  • string hiddenPass; //Not revertible and not easy to break using a brute force way (e.g. Hash).
  •                               //On-side encryption is not a good idea.
  •                               //Encryption with private key is still used (e.g. SSH).
  • int status; //activated? ban?


In Memory:

  • class User {
    • long userID; //P_KEY
    • string userName;
    • string hiddenPass;
    • int status;
  • }

  • class Session {
    • long sessionID; //P_KEY
    • long userID;
    • int devideType;
    • long timeOut; //ms
  • }

Design PayPal

1. Cases:

Case 1: A user open PayPal from some websites (e.g. shopping)
Case 2: A user login PayPal
Case 3: A user pay money using PayPal
Case 4: A user search payment history

Case 5: A seller login PayPal
Case 6: A seller configure payment when user access PayPal
Case 7: A seller search payment history

2. Constraints

Users: 100 Million users
Daily Active user: (assuming 1%) 1 million users
Average Transaction Per Second: 1 million users * 1.5 / (3600 * 24) = 17.36
Peak Transaction Per Second: 17.36 * 10 = 173.10 Tran/Sec

Assuming one machine's capacity is 10 trans/sec, requiring 18 machines. 

Daily new user (assuming 0.1%): 100,000 users
Average new user Per Second: 100,000 / (3600 * 24) = 1.57
Peak new user Per Second: 1.57 * 10 = 15.7 User/Sec

Assuming one machine's capacity is 10 users/sec, requiring 2 machines. 

3. Service:

User Service
(Register, Update, Remove, Login, Logout)

Transaction Service
(TranferFromOtherSystemInPaypal, TransferFromPaypalOut, TransferBetweenPaypals)

Record Service
(Insert, Delete, Search)

4. Data

User Account Data
(Database: MySQL)

Transaction Records
(Database: MySQL (double check functions))

Transaction Log
(Files)

5. Core Algorithms

1. Information Hashing
     
2. Transaction ACID
     Atomicity: Log first (Commit or Rollback)
     Consistency: Consistency checker
     Isolation: Independent Transactions
     Durability: Redundancy

6. All in One

Privacy Manager's Input: user name and password
Privacy Manager's Output: Hashed strings


Monday, 4 January 2016

Design UberX

1. Scenarios:

Case 1: A driver starts its service and drive around.
             The driver should see his/her locations moving on map.
Case 2: A driver accepts/denies a trip request.
Case 3: A driver takes a rider to the destination.
             The fee should be updated frequently.
Case 4: A driver dhows the fee and print the receipt for the rider.

Case 5: A rider searches nearby cabs.
Case 6: A rider posts a request and read the response.
Case 7: A rider find the cab.
Case 8: A rider takes the cab to the destination.
Case 9: A rider pay the fee and get the receipt.

2. Constraints:

Assumed number of full-time Drivers: 1 Million
Assumed number of part-time Drivers: 1 Million
Assumed number of Drivers during a peak hour: 1 Million * 80% + 1 Million * 20% = 1 Million

Assumed Average Trip Length: 20 minutes
Assumed Number of Trips during Peak hour: 2 Million trips

Assumed Peak Trips in a second: 2 Million trips / 3600 * 2 = 1111 trip/second

Assumed Memory Requirements: 2 Million * 20 / 60 * 1 Kb = 667 Mb
Assumed Traffic Requirements: 2 Million * 20 / 60 * 10 Kb/s = 6.67 Gb/s

Assumed Data Storage Requirements Per peak hour: 2 Million trips * 1kb = 2 Gb

3. Service:
  • 1. UberX Monitoring Service (both Demand and Supply)
  • 2. Application Service
  • 3. Dispatch Service
  • 4. Business Logic Service (Fee & Receipt)
  • 5. Payment Service (3th Party)
  • 6. User Service
4. Data

Demand and Supply (including gps updating)
No SQL

Business Logic Service
No SQL

Dispatch
No SQL

User Service
Database (MySQL)

Payment Histrory
Database (MySQL)

5. All Together


6. Key Algorithms / Methods
  • Dispatching Algorithm  (Minimizing Estimated Arrive Time)
  • Arrive Time Estimation Algorithm


Sunday, 3 January 2016

Design Netflix

1. Cases

Case 1: A user loges in to Netflix
    case sensitive? verification code? forgot password? first time log in?
Case 2: A user searches Netflix
    search by keywords, search by time, search by players, search by catalog.
Case 3: A user plays a movie
    pause, faster, skip the beginning, full screen.
Case 4: A user writes a comment
    comment format? remove comments?
Case 5: A user pays his/her subscription fees
    security

2. Constraints

2015 Oct: 69.17 million subscribers ~ 70 million

Active Users in Peak Time:
70 million * 30 minutes / (24 * 60) * 10 [Peak] = 14.6 million

Active Traffic Requirement in Peak Time:
14.6 million * 200,000 Byte/s = 2.92 TB/s

Storage (minimum):
Movie: 500,000 * 1 GB = 500 TB
Comments: 70 million * 1000 * 1000 Byte: 70 TB

3. Service

User Account Service

Movie & Catalog Service

Comment & Recommendation Service

4. Data

User Account Data
MySQL or PostgreSQL

Movie & Catalog Service
Files & Mongodb (Flexible Schema)

Comment & Recommendation Service
Mongodb (Large Data Size)

5. All Together

Friday, 1 January 2016

九章的系统设计班 http://www.jiuzhang.com/course/2

系统设计班

你是否已经刷过很多算法,能顺利写出基本算法,但Dream company的面试,却倒在高难度的System Design问题?你是否有了一定工作基础,对基本题已经失去兴趣,但仰望进入更有挑战更NB的公司?九章算法系统设计班,会为你提供一线情报和实战训练,全面提升你的综合素质,我们会竭力帮助你临门一脚,改变你的人生!

课程优势

在线视频授课。无需东奔西走求学,浏览器,手机端,无需预装任何软件即可上课。课堂实时互动。学生可以向老师提问,及时解决课堂上的疑问。
谁来讲:张无忌老师。北京大学博士。前阿里巴巴高级专家(P8);组建50人团队,研发千万级用户平台,实现千万级收入。

讲什么:通过实战面试真题,揭秘数据系统设计、web系统设计、爬虫系统设计、消息系统设计、统计系统设计的面试技巧。助你临门一脚,拿到更好的offer!课程的题目不仅是面试真题,更是在实际工程中会遇到的真实问题。因此不仅有利于面试,更有利于同学的多快好省地完成实际工作。

适合谁:无需任何算法基础和系统设计基础。适合北美和国内的应届毕业生,非应届毕业生,特别是已经有工作经验准备跳槽的工程师。课程也适合于想要在公司内快速晋升的工程师,将所学应用在日常的工作和晋升答辩之中。

本课程每节课2小时,共7节课

授课内容

1

走进系统设计 Introduce System Design 【免费试听】


本节大纲
  • 什么是系统设计?
  • 如何高效并有效的备考面试?《结构化面试指南》
  • 如何回答系统设计题?精讲一道真题让学员轻松pass一半以上的系统结构面试——如何设计Uber

2

数据系统设计 Database System Design


本节大纲
  • 实战真题
        1)Design data with class and inheritance
        2)Design user system
        3)Design payment system
  • 掌握数据系统设计的知识和方法论    
  • 关键词:database, primary/foreign key, table, row, attribute, index, transaction, log, lock, lifecycle graph, binary search tree, B+Tree, atomicity, consistency, isolation, durability, session

3

爬虫系统设计 Web Crawler System Design


本节大纲
  • 实战真题
        1)Design a crawler
        2)Design thread-safe consumer and producer
        3)Design Tiny URL
  • 掌握爬虫系统设计的知识和方法论
  • 关键词:crawler , HTTP, socket, HTML , TCP, thread, lock, conditional variable, semaphore, tiny URL, insert, lookup, encoding, producer&consumer

4

网站应用系统设计 Web Application System Design


本节大纲
  • 实战真题
        1)What happens when you visit google?
        2)How to increase the visiting speed of a page?
        3)Design rate limiter
  • 掌握Web系统设计的知识和方法论
  • 关键词:failure rate, DNS, web server, file server, timeout, content delivery network, cookie, HTTP, divide and conquer, Internet service provider, hosts, hijack, retention rate, cache, lazy load, rate limiter, QPS, counter , expire, request list, token bucket

5

分布式系统设计 Distributed System Design


本节大纲
  • 实战真题
        1)What is GFS, BigTable, MapReduce?
        2)Design distributed file system
        3)Design distributed database
        4)Calculate with MapReduce
  • 掌握分布式系统设计的知识和方法论
  • 关键词:Google File System, BigTable, MapReduce, bloom-filter, index, log, hot spot, replication, read/write, chunk, traffic, key/value

6

订阅和消息系统设计 Feed & Message System Design


本节大纲
  • 实战真题:
        1)Design feed system
        2)Design chat system
  • 掌握订阅和消息系统设计的知识和方法论
  • 关键词:feed, pull, push, search, chat, SQL, NoSQL, memory database, broadcast, batch, compression, delay, WebSocket, online, timeline

7

面向对象系统设计 OOD System Design


本节大纲
  • 实战真题:
        1)Design typeahead
        2)Design elevator
        3)Design parking lot
        4)Design blackjack
        5)Design Achievement system
  • 掌握面向对象设计的知识和方法论
  • 关键词:typeahead, instant search, binary search tree, trie tree, ternary search tree, like, cache, friend of friend, log, OOD, class, know/do/is, elevator, parking lot, card game, achievement system