Wednesday 10 September 2014

System Design & Analysis (6) Git TBA

Git is my favorite Version Control System. It is stable, fast and opensource.

Basic:
1) It is designed by the super famous guy Linus Benedict Torvalds.
2) It is the most widely adopted version control system for software development.
3) Git is free software distributed under the terms of the GNU General Public License version 2.

Development Platform:
1) Git is primarily developed on Linux.

Project Target:
1) Designed for large projects.
2) A core assumption in Git is that a change will be merged more often than it is written, as it is passed around various reviewers.
3) Localized fast-commit, without internet.

Designed Operations:



















TBA

Success?
Yes, see the user list:







Tuesday 9 September 2014

System Design & Analysis (5) PARAMICS

PARAMICS is a traffic microsimulation software developed by Quadstone Paramics. It is also the first traffic simulation software I have used. Instead of a software, PARAMICS is more like a ecosystem. It contains everything from network Editor, Visualizer, DTA, Parallel Platform, Data Analysis, Reporting Auto-generation and Video-capturing.

Basic:
1) Originally established in the early 1990s by the UK Department for Transport.
2) The software is widely used by government agencies, consultancies and universities. 

Development Platform:
C++
Excellent Core + Plugin System

Project Target:
One-stop microsimulation software

Platform Design:
1) Modeller: Build and Run a microsimulation.
2) Processor: A tool that allows simulations to be run on any number of networked PCs.
3) Analyser: A toolkit for post simulation data analysis.
4) Estimator: for OD matrix estimation in microsimulation level.
5) Converter: Network data format converter.
6) Monitor: Evaluate the levels of traffic emission pollution.
7) Programming: Enable programmers to control using C++.
8) PPM: Pedestrian Module
9) Designer: 3D Studio

Key Concepts: (Nothing New but Productive & Professional)
1) OD Matrix
2) 3-D
3) Behavior Models: Car-Following & Lane Changing
4) OD Estimation
5) Report Auto-generation
6) Parallel Computing

GUI:




























Recent Updates:
1) Support Environment and Emissions
2) Support Pedestrian Movement

Success?
Yes, It is a big success!
Compared to tens of other simulation products, Paramics is extremely successfully. 

Future
The core of Paramics is computational intensive and is not suitable for city-scale traffic simulation.
The current Paramics suit does not quite fit in the area of city-scale traffic simulations.
Something should be changed.

Monday 1 September 2014

System Design & Analysis (4) SCATS

SCATS® (Sydney Coordinated Adaptive Traffic System) is an adaptive urban traffic management system that synchronises traffic signals to optimise traffic flow across a whole city, region or corridor.

Basic:
1) developed in Sydney, Australia by former constituents of the Roads and Maritime Services in the 1970s.
2) about 34,350 intersections in over 154 cities in 25 countries use the system. For example:
    Singapore: GLIDE
    Canberra "CATSS" (Canberra Automated Traffic Signal System)
    Melbourne "SCRAM"
    Perth "PCATS"
3) SCATS is already a recognised worldwide market leader in intelligent transport systems.

Development Platform:
1) It is a distributed system. Intersections are connected over PAPL, ADSL, PSTN.
2) Data is collected from loop detectors in intersections.
3) Signal time planing depends on real-time collected data. (Adaptive) The process happens in local PCs.

Project Target:
1) Provide an adaptive and comprehensive signal control application. 

Key Concepts:
1) Interval
2) Phase
3) Phases (Phase Organization)
4) Signal Planing (Phase Timing)
5) Sub-System (Master Signal & Slave Signal)
6) Priority Vehicles (Dynamic "Phase Organization" or Dynamic "Phase Timing")

GUI:
Since it is a command system, there is no GUI.
AN example control panel:
















Recent Updates:
SCATS is recently deployed in New Zealand, Hong Kong, Shanghai, Guangzhou, Amman, Tehran, Dublin, Rzeszów, Gdynia.

Success?
Yes, SCATS is a very successful system.
Designed in 1970s, even older than Windows, SCATS is still widely used in cities.

The success is based in:
1) SCATS succeed to implement its project target. 
2) SCATS is extremely stable.
3) No competitors.

System Design & Analysis (3) Waze

2010, in a NUS class "Intelligent Transportation System", I introduced Waze to classmates in an assignment. At that time, it is a startup software, which has <100 users in Singapore.
2013, Google completed the acquisition of Waze for a reported US$1.3 billion. There are only 100 employees.

Basic:
1) community-driven navigation application
2) only on mobility phones
3) real-time information sharing software
4) location-based services, e.g. gas

Development Platform:
Various: iPhone and Android (majorly)

Project Target:
Solve the challenges (high map update cost and information delay) by using communities.
1) Community activeness is the key and the most valuable in this project.
2) Everything (GUI, Function and Fun) is about user activity.

System Design & Key Concepts:
1) Turn-by-turn Navigation
2) User Trajectory&Speed Uploading (updating map)
3) Real-time Road Information Update
4) User Activeness

GUI:



















Recent Updates:
*In 2011 Waze Mobile updated the software to display real-time, community-curated points of interest.
*In June 2012 Waze launched an update to provide real-time fuel prices.
*In June 2013, Waze introduced a global localization project that will enable future road closures and real-time traffic updates.

Success?
Yes, it is a very successful project. The success is based in:
1) Its fundamental idea to do it is great (Great Idea) and hard to achieve (Great Leader) and hard to compete (even Google has no confident to beat it). 
2) It captures user activeness. (Great Designer)

However, it meet with the monetizing problem, which is everywhere in map apps (even Google Map, Baidu Map, and off course Apple Maps and Microsoft Maps). 

Bought by Google, I think it is double-benefits:
For Waze, it can use Google Ad to get incomes (Location-based Ads) from its massive users.
For Google, it can avoid another competitor. (1B is a bit too much)

Future
In my view, Waze has achieved what it planned to do. 
Its functions are complete.
The future is to make it world-widely used and making money. 

Finally
This post focuses on high-level system design. So, the user activeness attraction is not included.

Friday 29 August 2014

System Design & Analysis (2) Microsoft Visio

Microsoft Visio 2007 is one of my favorite software. I used it to draw flow charts, diagrams and to build GUI prototypes.

Basic:
1) a drawing software to draw everything in various industry.
2) a commercial software by Microsoft to enrich Office. Thus, Visio is plugin-able in Words, PowerPoint, etc.
3) It is 2-D drawing software.
4) It was first introduced in 1992, made by the Shapeware corporation. It was acquired by Microsoft in 2000.

Development Platform:
1) Programming language: unknown.

System Design: 
Using target: create any drawing in 3 basic steps
1) Choose and open a template
2) Drag and connect shapes
3) Add text to shapes

Key Concepts:
1) Stencils and templates: a group of shapes related with a topic.
2) Shapes: the basic unit that is copy-able and edit-able.
3) Data: associate data with shapes. Different shapes might have specific data and data view.

An example of stencils:














Key Product Features:
1) extremely extendable by new stencils, new shapes and new data
2) support programming
3) work with Word, PowerPoint, Access, OneNote, Access, Excel

GUI:
















Recent Versions: (still not a default unit in Office)
  • Office Visio 2007 (v12.0; Standard, Professional)
  • Visio 2010 (v14.0; Standard, Professional, Premium)
  • Visio 2013 (v15.0; Standard, Professional)

Success?
1) From user's point of view, VISIO is professional, powerful and user-friendly.
2) However, it is also expensive, heavy, and looks not easy to use (even through it is not).
3) From product point of view, VISIO does not get that success as Word and Excel. 
    For two major reasons: 
    A: VISIO itself is (or feels) heavy, both product and expense.
    B: Microsoft did not make VISIO a default unit, and enhanced the draw features of PowerPoint. It makes VISIO not a must for non-professional users. In my view, instead of simplify VISIO for normal users (like: make stencils plugin-able and simplify the GUI), Microsoft decided to borrow VISIO to PowerPoint and leave VISIO to rich professionals and industry users.

Tuesday 26 August 2014

Professional Research Engineer

I like the role as a "Research Engineer". It feels like the role gives me opportunity to do both research and software engineering. Then, how to behave as a Professional one? 

This is the To-Do and Not-Do list. Just for my first Job.

1) To-Do
    A: Do what I believe and then Attitude of making it happen.
    A: Transparency, openness and genuineness.
    B: Save others' time and efforts.
    C: Take notes of my promises. That's a BIG mark.
    D: Aim to be professional: both in hard skills and soft skills. Professional to teammates, not friendly to teammates.
    E: Build trust and respect in career.
    F: To open and aware.
    G: keeping track of career accomplishments.
    H: Learn from teammates.
    I: Be easy to work with.

2) Not-Do
    A: Multi-tasking too much, or Not-prioritizing
        Long-term Targets
        Short-term Works, like: meetings, talks and reports.
    B: Complaining
        Optimistic and Solving
        Release angry on things, not on people.
    C: Making promises you can’t keep !!!
         Be serious! Save my own time !!! 
         Trust is more important than promises.
    D: In charge of others' business.
         Think about Dhinesh. Do not do that, please.
    E: Afraid of mistakes.
         Everyone made mistakes, calm down and avoid the same mistake.
         Take the responsibility only if it is my mistake.
    F: Over-work
         It always (90%) does not solve the problems.

Finally, Rules above all, Be Confident and Positive.

Tuesday 12 August 2014

职业生涯(1)

即将开始自己的第一份工作,
作为一个PhD,进入CS Industry,
想一下PhD这5年,
费了半天劲才拿到一个PhD 学位,
5年的时光,转瞬就过去了,
翻看一下Linkedin上CS PhD的职业生涯,
也算是一件很有意思的事情,
Startup,Director,Professor,Manager,Analyst,Scientist,
好多眼花缭乱的头衔,
Where is my Passion?
说不清楚,应该是介于Manager和Analyst之间,
我喜欢Software,Numbers,Algorithms, and Communicate with Teammates and Clients.
就我现在的能力而言,
编程能力已经足够自信,
算法能力也足以设计大多数的系统,
沟通能力急需改善,这也是面试Google得到的教训,清晰沟通的能力和编程能力一样重要,为此,我打算在工作前报一个澳洲的口语学习班;尽量面对面的那种;
系统设计能力不足,这是不注意观察和讨论的结果,身边的诸多系统的设计原理,我都比较陌生,诸如:Github, Ubuntu, Eclipse, G++, Qt。 对此,我打算定期学习一些身边的系统,然后写一些博客,有空的时候和朋友沟通,
这应该是我对自我职业生涯现状的认识,
具体的技术如data mining,image processing,high frequency trading,UI,website development, mobile network 等也存在很大的差别,根据最近的面试经历而言,每一个行业都存在一定的行业壁垒,
诸如PR身份的问题也是职业发展的一个重要因素,也应该在考虑范围之内,
对于时间协调,人员协调,矛盾化解,这都是我比较陌生的问题,暂时还没有打算处理这些问题,多多观察身边的事情就是了,
试着站在一个更高的角度上看现在的自己,
我对自己的定位是technical problem observer and solver, 不局限于具体的行业或者技术,发挥自己的积极能动和思考能力,为自己争取机会,
根据过去的情况来看,我犯过的错误包括:
没有二次检查重要试验结果,Less Communication,自我定位,
过去犯过的错误应该避免,我的性格不缺乏士气,但易于冒进

Thursday 7 August 2014

System Design & Analysis (1) Octave

Octave
http://en.wikipedia.org/wiki/GNU_Octave

Basic:
1) free software under the terms of the GNU General Public License.
2) a high-level programming language.
3) a replacement of MATLAB.
4) primarily intended for numerical computations, like: regression, plot.

Development Platform:
1) written in C++ using the C++ standard library.
2) uses an interpreter to execute the Octave scripting language.
3) works with gnuplot and Grace software to create plots, graphs, and charts, and to save or print them.
4) later include a Graphical User Interface (GUI) in addition to the traditional Command Line Interface (CLI). GUI developed using Qt.

Key Product Features:
1) Support key things a programming language, e.g. variables, functions, comments.
2) Large number of supports on matrices, built-in math functions.
3) Octave treats incompatibility with MATLAB as a bug, interesting.

GUI:

















Open Source Update:
2011-2012:
about 90 di erent contributors made 2917 commits, 57 of them made only 1 or 2 changes, the top 5 contributors made 2377 commits.

Incomes:
A few large donations from a small number of benefactors
Many small donations from users

Limitation:
1) Interaction with C++ and Java. In many cases, we want to use Octave code as a small part in a large project.
2) No Object concept. But I do not think it is a big limitation currently.

Success?
From users' point of view, Octave achieves its design goal as a GNL free software to replace the expensive MATLab, it is a success.
However, the developer team do need to find more ways to sell Octave.

Thursday 26 June 2014

Mathematical Problems in Algorithms (Summary)

Source: http://help.topcoder.com/data-science/competing-in-algorithm-challenges/algorithm-tutorials/mathematics-for-topcoders/

1: Prime Check

public boolean isPrime (int n)
{
   if (n<=1) return false;
   if (n==2) return true;
   if (n%2==0) return false;
   int m=Math.sqrt(n);

   for (int i=3; i<=m; i+=2)
      if (n%i==0)
         return false;

   return true;
}
 
2: Get All Primes
 
public boolean[] sieve(int n)
{
   boolean[] prime=new boolean[n+1];
   Arrays.fill(prime,true);
   prime[0]=false;
   prime[1]=false;
   int m=Math.sqrt(n);

   for (int i=2; i<=m; i++)
      if (prime[i])
         for (int k=i*i; k<=n; k+=i)
            prime[k]=false;

   return prime;
}
 
3: greatest common divisor (GCD)
Euclid's algorithm 

//assume that a and b cannot both be 0
public int GCD(int a, int b)
{
   if (b==0) return a;
   return GCD(b,a%b);
} 

4: lowest common multiple (LCM)
 
public int LCM(int a, int b)
{
   return b*a/GCD(a,b);
}
 
5: Rectangles Overlap
 
Suppose we have two rectangles R1 and R2. Let (x1, y1) be the location 
of the bottom-left corner of R1 and (x2, y2) be the location of its 
top-right corner. Similarly, let (x3, y3) and (x4, y4) be the respective
 corner locations for R2. The intersection of R1 and R2 will be a 
rectangle R3 whose bottom-left corner is at (max(x1, x3), max(y1, y3)) 
and top-right corner at (min(x2, x4), min(y2, y4)). If max(x1, x3) > 
min(x2, x4) or max(y1, y3) > min(y2, y4) then R3 does not exist, ie 
R1 and R2 do not intersect. 

6: Fractions
 
public int[] multiplyFractions(int[] a, int[] b)
{
   int[] c={a[0]*b[0], a[1]*b[1]};
   return c;
} 

Wednesday 25 June 2014

What happened in Google Search?

Source:http://www.youtube.com/watch?v=modXC5IWTJI&feature=player_embedded






































Why Gmail does not improve its left-side bar? I use it hundreds of time every week and suffer hundreds of time every week


How to find a Solution?

Source: How To Find a Solution http://community.topcoder.com/tc?module=Static&d1=tutorials&d2=findSolution

1)
Straight-forward problems that don't require any special technique (e.g. simulation, searching, sorting, BST etc.)

2) 
Breadth First Search (BFS)
Problems that use BFS usually ask to find the fewest number of steps (or the shortest path) needed to reach a certain end point (state) from the starting one. & all of edges having the same cost of 1.

Example Leetcode:
1: Word Ladder

3)
Flood Fill
Sometimes you may encounter problems that are solved by the help of Flood Fill, a technique that uses BFS to find all reachable points. 

Example Leetcode:
1: Surrounded Regions
2: Binary Tree Level Order Traversal

4)
Brute Force
How does a brute force algorithm work? Actually, it tries all possible situations and selects the best one. It's simple to construct and usually simple to implement. If there is a problem that asks to enumerate or find all possible ways (situations) of doing a certain thing, and that doesn't have high limits - then it's most probably a brute force problem.

5)
Dynamic Programming
1: Optimal Subproblem
2: Repeated Computing

AND both the search space and the target are integers!!

Example Leetcode:
1: Unique Paths I & II
2: Climbing Stairs
3: Decode Ways

6) 
Greedy Algorithm
1: Optimal Subproblem
2: Repeated Computing
3: Greedy Property

7)
Recursion Method
Tree, String, Array

Example Leetcode:
1: Scramble String

8)
N-Scan Method

  1. 2-round Scan
  2. Fast-Slow Scan
  3. Reversed Scan

9) 
Linear Programming
1: Objective Function
2: Constraints


Wednesday 11 June 2014

Coding Strategy of Leetcode

Reverse Words in a String
Idea: Double Reverse

Evaluate Reverse Polish Notation
Idea: Typical Last-in-First-out problem, so should use Stack

Word Break I
Idea: DP
CanBreak[i] = CanBreak[k] && word.substr(k+1, i) in the Dictionary 

Word Break II
Idea: DP
Redundancy: Repeated checking all possibilities in a place
Solution: Pre-Calculate the list

Idea: Slope -> HashMap
Careful: Vertical - Same x location;
Careful: Same Location: Repeated node;

Sort List
Idea: Merge Sort O(nlgn) and Constant Space
Careful: define new head node when merging;


Insertion Sort List

 
Idea: find the location from the head and always keep its previous node;

LRU Cache

Idea: hashmap + list
get: O(1) need a Hashmap<key, Object*>
       Also, need to insert to front of the list;
set: O(1) need a Hashmap and then add to top or re-insert to top;

Binary Tree Postorder Traversal
Idea: Magic two-stacks version
Or
Idea: one-Stack-prev-curr version

Binary Tree Preorder Traversal

Idea: Stack + Right-First-In-Stack

Binary Tree Inorder Traversal
Idea: Code attached.
while(current != NULL || mystack.empty() == false) {
if(current != NULL) {
mystack.push(current);
current  = current -> left;
}
else {
printf("%d,", mystack.top()->key);
current  = mystack.top() -> right;
mystack.pop();
}
}

Reorder List
Idea: partial re-order and then merge lists

Linked List Cycle I && II

 
Idea: fast-slow traversal algorithms

Copy List with Random Pointer
Idea: std::unordered_map<RandomListNode*, RandomListNode*> a_hash_map

Single Number I
Idea: Feature of bitwise operation XOR: cancel each other

Single Number II
Idea: Count and check the sum of each bit % 3, is 0 or not? 
If Yes, the missing number has 0 in the location, If No, set the location as 1. like: n = n | (1 << l)

Idea: Two-round Scanning and Double-check correctness

Gas Station
Idea: Greedy Search the first location until the sum to end is always positive

Clone Graph

Idea: std::unordered_map<RandomListNode*, RandomListNode*> a_hash_map

Palindrome Partitioning

Idea: DP & Pre-Calculating Post-ward Palindrome Array

Palindrome Partitioning II
Idea: Same with I but keep a updated parameter

Surrounded Regions

Idea: Queue & O -> Y firstly and then O -> X && Y -> O;


Idea: Recursive Method

Longest Consecutive Sequence
Idea: std::unordered_set & check surrounding

Word Ladder
Idea: shortest path

Word Ladder II
Idea: output all paths (loops processing) + std::unordered_set<bool> processed 

Valid Palindrome
Idea: Clean String and Scan

Best Time to Buy and Sell Stock
Idea: maintain the min value and max profit

Best Time to Buy and Sell Stock II
Idea: Aggregate all benefits

Best Time to Buy and Sell Stock III
Idea: DP and two-round scan


Triangle

Idea: Recursive Algorithm and re-used vector (Too Costly)
Idea: Post-Order Scan


Pascal's Triangle I & II

Idea: Purely Logic, no complex algorithm

Populating Next Right Pointers in Each Node I & II
Idea: BFS, level by level processing; Instead of using transitional queues, use pointers and next; 

Flatten Binary Tree to Linked List
Idea: Post-order processing and connect left to right

Path Sum I & II
Idea: Recursive Algorithm

Minimum Depth of Binary Tree 
Idea: Recursive Algorithm and Cut Branch

Balanced Binary Tree
Idea: Recursive Get Height and Compare. (BFS, DFS [P/I/P-order] are mostly enough)


Convert Sorted Array to Binary Search Tree

 
Idea: Recursively find the middle and new a node;

Convert Sorted List to Binary Search Tree
Idea: Build a BSF with N nodes and then set the value using In-order traversal method

Binary Tree Level Order Traversal I && II
Idea: use two-queue method; Reverse the vector in the end;


Construct Binary Tree from Inorder and Postorder Traversal

 

Construct Binary Tree from Preorder and Inorder Traversal

 
Idea: Recursive Method and HashTable for seaching locations

Construct Binary Tree from given preorder and postorder traversals

No unique tree if no more constraints.

Maximum Depth of Binary Tree
Idea: Recursive method

Binary Tree Zigzag Level Order Traversal
Idea: Two stacks + Flag: leftToRight

Symmetric Tree
Idea: Level by Level + Check Symmetric 

Same Tree
Idea: Post-Order Scan

Recover Binary Search Tree
Idea: 
1: In-Order Traversal -> Find the first node violating ascending order
2: Reversed In-Order Traversal -> Find the first node violating descending order
3: Swap

Interleaving String
Idea: Recursive Algorithm is too costly, for cases like:s1=aa*, b=aa*, c=aaaa**;
Idea: Use 2-D Dynamic Programming; 
M[i][j] = (M[i][j-1] && b[j] == c[i+j]) || (M[i-1][j] && a[i] == c[i+j]);

Unique Binary Search Trees I && II
Idea: Recursively pick the middle node; For II, copy for each branch;

Restore IP Addresses
idea: BFS and cut branch
Reverse Linked List II
Idea: Find and Reverse, careful the border cases

Subsets I && II
Idea: Sort and then 2-level Scan

Decode Ways
Idea: BFS but careful on the special cases, 0

Merge Sorted Array
Idea: Scan the array from the back to the front

Scramble String

Idea: DFS and cut branch

Partition List
Idea: maintain a pointer to the next place to insert, like quick sort.

Maximal Rectangle
Idea: Pre-Process 2-D Matrix to keep the number of 1s in the same row before it.

Largest Rectangle in Histogram
Idea 1: Surrounding Search O(n^2)
Idea 2: Divide and Conquer O(nlgn);
Idea 3: Stack O(n)

Remove Duplicates from Sorted List I && II
Idea: Logic, no algorithms

Sort Colors
Idea 1: Counting Sort
Idea 2: Swap, but be careful on the left side and right side (different)
Search a 2D Matrix
Idea: Two Binary Search, care special cases.

Set Matrix Zeroes
Idea: Two vectors to keep the rows and cols that need to set to 0.

Climbing Stairs
Idea 1: BFS, but costly, O(2^n)
Idea 2: DP. N[n] = N[n-1] + N[n2], O(n)
Idea 3: Fibonacci Optimization Algorithm, O(lgn) 

Sqrt(x)

Idea: Binary Check (min, max), careful overflow.

Plus One
Idea: addition (a, b, carry_bit), no algorithm

Add Binary
Idea: Similar to Plus One

Merge Two Sorted Lists
Idea: Logic, no algorithm

Spiral Matrix II
Idea: Spiral Scan Matrix, Each Time Scan 4 Border

Pow(x, n)
Idea: Recursive and Re-use

Rotate Image
Idea: Spiral Scan Matrix, Each Time Scan 4 Border

Permutations

 
Idea: swap for each possibility

First Missing Positive
idea: swap, careful special cases, duplicate values.

Count and Say
Idea: Logic, no algorithm
Remove Element
Idea: Logic, no algorithm
Remove Duplicates from Sorted Array
Idea: Similar with Remove Element

Valid Parentheses
Idea: Stack

Remove Nth Node From End of List
Idea: two-index method

3Sum & 3Sum Closest
Idea: Sort and Hash; Sort and 2-Index Scan

Longest Common Prefix

Idea: Logic, No Algorithm

Roman to Integer
Idea: Logic, No Algorithm

Regular Expression Matching
Idea: Star or not

Palindrome Number
Idea: First (%(10^Bits)) == Last (&1)

String to Integer (atoi)
Idea: Logic, Many Special Cases

Reverse Integer
Idea: Logic, care special cases, overflow and 0.

Longest Palindromic Substring
Idea: Pick any element as a center

Add Two Numbers
Idea: Logic, no algorithm

Longest Substring Without Repeating Characters
Idea: hash_map and maintain the max_length

Median of Two Sorted Arrays
Idea: Recursive Algorithm