Big O Notation
Say you order Harry Potter: Complete 8-Film Collection [Blu-ray] from Amazon and download the same film collection online at the same time. You want to test which method is faster. The delivery takes almost a day to arrive and the download completed about 30 minutes earlier. Great! So it’s a tight race.
What if I order several Blu-ray movies like The Lord of the Rings, Twilight, The Dark Knight Trilogy, etc. and download all the movies online at the same time? This time, the delivery still take a day to complete, but the online download takes 3 days to finish.
For online shopping, the number of purchased item (input) doesn’t affect the delivery time. The output is constant. We call this O(1).
For online downloading, the download time is directly proportional to the movie file sizes (input). We call this O
.
From the experiments, we know that online shopping scales better than online downloading. It is very important to understand big O notation because it helps you to analyze the scalability and efficiency of algorithms.
Note: Big O notation represents the worst-case scenario of an algorithm. Let’s assume that O(1) and O
are the worst-case scenarios of the example above.
Recursion
Someone in a movie theater asks you what row you’re sitting in. You are too lazy to count, so you ask the person in front of you. You simply have to add 1 from the person’s answer to get your current row number. Brilliant right? However, the person in front of you did exactly the same thing, and so on. Finally the question reaches row 1 and he answers: “I’m in row 1!”. From there, the correct message (incremented by one each row) will pass all the way up to the person who asked.
Aaron Krolik/Quora
Here’s another example known as the Droste effect. A nurse is carrying a tray with a box of cocoa and a cup containing a smaller image of her holding the same thing, which in turn contains an even smaller version of the image, and so on.
Big Data
Let’s assume you have a leak in a water pipe in your garden. You take a bucket and some sealing materials to fix the problem. After a while, you see that the leak is much bigger that you need a plumber to bring bigger tools. In the meanwhile, you are still using the bucket to drain the water. After a while, you notice that a massive underground stream has opened. You need to handle gallons of water every second.
Buckets aren’t useful anymore. You need a completely new approach to solve the problem because the volume and velocity of water has grown. To prevent the town from flooding, you may need the government to build a massive dam that requires an enormous civil engineering expertise and an elaborate control system.
Balaji Viswanathan/Quora
Big data describes data sets so large and complex that is impossible to manage with conventional data processing tools.
Greedy Algorithm
Imagine you are going for hiking and your goal is to reach the highest peak possible. You already have the map before you start, but there are thousands of possible paths shown on the map. You are too lazy and simply don’t have the time to evaluate each of them. Screw the map! You started hiking with a simple strategy – be greedy and short-sighted. Just take paths that slope upwards the most.
After the trip ended and your whole body is sore and tired, you look at the hiking map for the first time. Oh my god! There’s a muddy river that I should’ve crossed, instead of keep walking upwards.
A greedy algorithm picks the best immediate choice and never reconsiders its choices.
Hill Climbing
This time you’re climbing another hill. You’re determined to find the path that will lead you to the highest peak. However, there’s no map provided and it’s very foggy. To make your trips easier, you have downloaded a hiking app that track paths you’ve taken and measures your current altitude.
You climb the hill over and over again. Each time, you take the exact same path that leads you to the highest peak ever recorded, but somewhere in the middle of your journey, you choose a slightly different route.
You can also randomly choose a different starting point, which is known as random-restart hill climbing. So that you don’t just linger around the same area and reduce your probability of getting stuck.
The hill climbing algorithm attempts to find a better solution by generating a neighboring solution. Each neighboring solution is generated based on the best solution so far, with a single element modified.
Say you order Harry Potter: Complete 8-Film Collection [Blu-ray] from Amazon and download the same film collection online at the same time. You want to test which method is faster. The delivery takes almost a day to arrive and the download completed about 30 minutes earlier. Great! So it’s a tight race.
What if I order several Blu-ray movies like The Lord of the Rings, Twilight, The Dark Knight Trilogy, etc. and download all the movies online at the same time? This time, the delivery still take a day to complete, but the online download takes 3 days to finish.
For online shopping, the number of purchased item (input) doesn’t affect the delivery time. The output is constant. We call this O(1).
For online downloading, the download time is directly proportional to the movie file sizes (input). We call this O
From the experiments, we know that online shopping scales better than online downloading. It is very important to understand big O notation because it helps you to analyze the scalability and efficiency of algorithms.
Note: Big O notation represents the worst-case scenario of an algorithm. Let’s assume that O(1) and O
Recursion
Someone in a movie theater asks you what row you’re sitting in. You are too lazy to count, so you ask the person in front of you. You simply have to add 1 from the person’s answer to get your current row number. Brilliant right? However, the person in front of you did exactly the same thing, and so on. Finally the question reaches row 1 and he answers: “I’m in row 1!”. From there, the correct message (incremented by one each row) will pass all the way up to the person who asked.
Aaron Krolik/Quora
Here’s another example known as the Droste effect. A nurse is carrying a tray with a box of cocoa and a cup containing a smaller image of her holding the same thing, which in turn contains an even smaller version of the image, and so on.
Big Data
Let’s assume you have a leak in a water pipe in your garden. You take a bucket and some sealing materials to fix the problem. After a while, you see that the leak is much bigger that you need a plumber to bring bigger tools. In the meanwhile, you are still using the bucket to drain the water. After a while, you notice that a massive underground stream has opened. You need to handle gallons of water every second.
Buckets aren’t useful anymore. You need a completely new approach to solve the problem because the volume and velocity of water has grown. To prevent the town from flooding, you may need the government to build a massive dam that requires an enormous civil engineering expertise and an elaborate control system.
Balaji Viswanathan/Quora
Big data describes data sets so large and complex that is impossible to manage with conventional data processing tools.
Greedy Algorithm
Imagine you are going for hiking and your goal is to reach the highest peak possible. You already have the map before you start, but there are thousands of possible paths shown on the map. You are too lazy and simply don’t have the time to evaluate each of them. Screw the map! You started hiking with a simple strategy – be greedy and short-sighted. Just take paths that slope upwards the most.
After the trip ended and your whole body is sore and tired, you look at the hiking map for the first time. Oh my god! There’s a muddy river that I should’ve crossed, instead of keep walking upwards.
A greedy algorithm picks the best immediate choice and never reconsiders its choices.
Hill Climbing
This time you’re climbing another hill. You’re determined to find the path that will lead you to the highest peak. However, there’s no map provided and it’s very foggy. To make your trips easier, you have downloaded a hiking app that track paths you’ve taken and measures your current altitude.
You climb the hill over and over again. Each time, you take the exact same path that leads you to the highest peak ever recorded, but somewhere in the middle of your journey, you choose a slightly different route.
You can also randomly choose a different starting point, which is known as random-restart hill climbing. So that you don’t just linger around the same area and reduce your probability of getting stuck.
The hill climbing algorithm attempts to find a better solution by generating a neighboring solution. Each neighboring solution is generated based on the best solution so far, with a single element modified.
