COURSES INCLUDED

1.

COURSE 1

Algorithmic Toolbox

The course covers basic algorithmic techniques and ideas for computational problems arising frequently in practical applications: sorting and searching, divide and conquer, greedy algorithms, dynamic programming. We will learn a lot of theory: how to sort data and how it helps for searching; how to break a large problem into pieces and solve them recursively; when it makes sense to proceed greedily; how dynamic programming is used in genomic studies. You will practice solving computational problems, designing new algorithms, and implementing solutions efficiently (so that they run in less than a second).

WEEK 1

Welcome

Welcome to the first module of Data Structures and Algorithms! Here we will provide an overview of where algorithms and data structures are used (hint: everywhere) and walk you through a few sample programming challenges. The programming challenges represent an important (and often the most difficult!) part of this specialization because the only way to fully understand an algorithm is to implement it. Writing correct and efficient programs is hard; please don’t be surprised if they don’t work as you planned—our first programs did not work either! We will help you on your journey through the specialization by showing how to implement your first programming challenges. We will also introduce testing techniques that will help increase your chances of passing assignments on your first attempt. In case your program does not work as intended, we will show how to fix it, even if you don’t yet know which test your implementation is failing on.
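
As a rough illustration of the stress-testing idea described above, the sketch below pits a slow but obviously correct solution against a fast one on small random inputs. It uses the Maximum Pairwise Product assignment listed below as the example problem. The function names, the restriction to non-negative integers, and the test sizes are assumptions made for this sketch, not the course's reference code.

```python
import random


def max_pairwise_product_naive(numbers):
    """Try every pair: O(n^2) time, but easy to trust."""
    best = 0
    n = len(numbers)
    for i in range(n):
        for j in range(i + 1, n):
            best = max(best, numbers[i] * numbers[j])
    return best


def max_pairwise_product_fast(numbers):
    """Multiply the two largest values, found in one O(n) pass.

    Assumes at least two non-negative integers.
    """
    largest = second = 0
    for x in numbers:
        if x > largest:
            largest, second = x, largest
        elif x > second:
            second = x
    return largest * second


def stress_test(trials=1000, max_len=6, max_value=20, seed=0):
    """Return the first random input on which the two solutions disagree, or None."""
    rng = random.Random(seed)
    for _ in range(trials):
        numbers = [rng.randint(0, max_value) for _ in range(rng.randint(2, max_len))]
        if max_pairwise_product_naive(numbers) != max_pairwise_product_fast(numbers):
            return numbers
    return None
```

Small inputs are deliberate: when the two solutions disagree, a short failing test is much easier to debug by hand.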

  • 01. Welcome!
  • 02. Overview
  • 03. Available Programming Languages
  • 04. Programming Assignment • A plus B
    • 01. Solving the Problem (screencast)
    • 02. Programming Assignment • Maximum Pairwise Product
  • 05. Practice Quiz • Solving Programming Assignments
  • 06. Practice Quiz • Solving Programming Assignments
    • 01. Solving the Problem: Improving the Naive Solution, Testing, Debugging
    • 02. Stress Testing: the [Almost] Silver Bullet for Debugging
    • 03. Stress Test – Implementation
    • 04. Stress Test – Find the Test and Debug
    • 05. Stress Test – More Testing, Submit and Pass!
    • 06. FAQ on Programming Assignments
  • 07. Practice Quiz • Solving Programming Assignments
WEEK 2

Introduction

In this module you will learn that programs based on efficient algorithms can solve the same problem billions of times faster than programs based on naïve algorithms. You will learn how to estimate the running time and memory of an algorithm without even implementing it. Armed with this knowledge, you will be able to compare various algorithms, select the most efficient ones, and finally implement them as our programming challenges!
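
To make the gap between naïve and efficient algorithms concrete, here is a minimal sketch that uses Fibonacci numbers as the example problem. The choice of problem and the function names are assumptions made for illustration. The recursive version takes exponentially many steps, while the iterative one takes a number of additions linear in n.

```python
def fib_naive(n):
    """Direct translation of the recurrence: exponentially many recursive calls."""
    if n <= 1:
        return n
    return fib_naive(n - 1) + fib_naive(n - 2)


def fib_fast(n):
    """Keep only the last two values: n additions in total."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a
```

Both functions return the same values, but `fib_naive(50)` would need billions of calls, whereas `fib_fast(50)` finishes immediately.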

  • 01. Why Study Algorithms?
  • 02. Coming Up
  • 03. Problem Overview
  • 04. Naive Algorithm
  • 05. Efficient Algorithm
  • 06. Resources
  • 07. Problem Overview and Naive Algorithm
  • 08. Efficient Algorithm
  • 09. Resources
  • 10. Computing Runtimes
  • 11. Asymptotic Notation
  • 12. Big-O Notation
  • 13. Using Big-O
  • 14. Resources
  • 15. Quiz • Logarithms
  • 16. Quiz • Big-O
  • 17. Quiz • Growth rate
    • 01. Course Overview
  • 18. Programming Assignment 1: Introduction
WEEK 3

Greedy Algorithms

In this module you will learn about a seemingly naïve yet powerful class of algorithms called greedy algorithms. After you learn the key idea behind greedy algorithms, you may feel that they represent the algorithmic Swiss army knife that can be applied to solve nearly all programming challenges in this course. But be warned: with a few exceptions that we will cover, this intuitive idea rarely works in practice! For this reason, it is important to prove that a greedy algorithm always produces an optimal solution before using it. At the end of this module, we will test your intuition and taste for greedy algorithms by offering several programming challenges.
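
One problem where the greedy idea does work is the Fractional Knapsack problem listed below. The following is a minimal sketch that assumes items are given as (value, weight) pairs with positive weights. The function name and the input format are hypothetical.

```python
def fractional_knapsack(capacity, items):
    """Maximum total value when fractions of items may be taken.

    items: list of (value, weight) pairs, each weight > 0.
    Greedy rule: always take as much as possible of the item
    with the highest value per unit of weight.
    """
    total = 0.0
    by_density = sorted(items, key=lambda item: item[0] / item[1], reverse=True)
    for value, weight in by_density:
        if capacity <= 0:
            break
        take = min(weight, capacity)
        total += value * take / weight
        capacity -= take
    return total
```

For example, with capacity 50 and items (60, 20), (100, 50), (120, 30), the sketch takes the third item, then the first, and nothing of the second.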

  • 01. Largest Number
  • 02. Car Fueling
  • 03. Car Fueling – Implementation and Analysis
  • 04. Main Ingredients of Greedy Algorithms
  • 05. Quiz- Greedy Algorithms
    • 01. Celebration Party Problem
    • 02. Efficient Algorithm for Grouping Children
    • 03. Analysis and Implementation of the Efficient Algorithm
    • 04. Long Hike
    • 05. Fractional Knapsack – Implementation, Analysis and Optimization
    • 06. Review of Greedy Algorithms
    • 07. Resources
  • 06. Quiz- Fractional Knapsack
  • 07. Programming Assignment 2: Greedy Algorithms
WEEK 4

Divide-and-Conquer

In this module you will learn about a powerful algorithmic technique called Divide and Conquer. Based on this technique, you will see how to search huge databases millions of times faster than using naïve linear search. You will even learn that the standard way to multiply numbers (that you learned in grade school) is far from being the fastest! We will then apply the divide-and-conquer technique to design two efficient algorithms (merge sort and quick sort) for sorting huge lists, a problem that finds many applications in practice. Finally, we will show that these two algorithms are optimal among comparison-based sorting algorithms (quick sort on average), that is, no comparison-based algorithm can sort asymptotically faster!
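
The two central ideas of this module can be sketched in a few lines each. The code below is a minimal illustration under simple assumptions (a sorted Python list for the search, any list of comparable items for the sort). The function names are chosen for this sketch.

```python
def binary_search(sorted_items, target):
    """Return an index of target in sorted_items, or -1 if it is absent.

    Each step halves the search range, so about log2(n) steps are needed.
    """
    lo, hi = 0, len(sorted_items) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if sorted_items[mid] == target:
            return mid
        if sorted_items[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1


def merge_sort(items):
    """Sort by splitting in half, sorting each half recursively, and merging."""
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    left, right = merge_sort(items[:mid]), merge_sort(items[mid:])
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged
```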

  • 01. Intro
  • 02. Linear Search
  • 03. Binary Search
  • 04. Binary Search Runtime
  • 05. Resources
  • 06. Quiz:- Linear Search and Binary Search
    • 01. Problem Overview and Naïve Solution
    • 02. Naïve Divide and Conquer Algorithm
    • 03. Faster Divide and Conquer Algorithm
    • 04. Resources
  • 07. Quiz:- Polynomial Multiplication
    • 01. What is the Master Theorem?
    • 02. Proof of the Master Theorem
    • 03. Resources
  • 08. Quiz:- Master Theorem
    • 01. Problem Overview
    • 02. Selection Sort
    • 03. Merge Sort
    • 04. Lower Bound for Comparison Based Sorting
    • 05. Non-Comparison Based Sorting Algorithms
    • 06. Resources
  • 09. Quiz:- Sorting
    • 01. Overview
    • 02. Algorithm
    • 03. Random Pivot
    • 04. Running Time Analysis (optional)
    • 05. Equal Elements
    • 06. Final Remarks
    • 07. Resources
  • 09. Quiz:- Quick Sort
  • 10. Programming Assignment 3: Divide and Conquer
WEEK 5

Dynamic Programming

In this final module of the course you will learn about a powerful algorithmic technique for solving many optimization problems called Dynamic Programming. It turns out that dynamic programming can solve many problems that evade all attempts to solve them using greedy or divide-and-conquer strategies. There are countless applications of dynamic programming in practice: from maximizing the advertisement revenue of a TV station, to searching for similar Internet pages, to gene finding (the problem where biologists need to find the minimum number of mutations to transform one gene into another). You will learn how the same idea helps to automatically make spelling corrections and to show the differences between two versions of the same text.
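
The spelling-correction and text-difference applications mentioned above rest on the edit distance between two strings. A minimal sketch of the standard dynamic programming table follows. It assumes unit cost for insertions, deletions, and substitutions, and the function name is chosen for this sketch.

```python
def edit_distance(a, b):
    """Minimum number of insertions, deletions, and substitutions turning a into b."""
    n, m = len(a), len(b)
    # dist[i][j] = edit distance between the prefixes a[:i] and b[:j]
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dist[i][0] = i  # delete all i characters
    for j in range(m + 1):
        dist[0][j] = j  # insert all j characters
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            substitution = 0 if a[i - 1] == b[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j] + 1,                 # deletion
                dist[i][j - 1] + 1,                 # insertion
                dist[i - 1][j - 1] + substitution,  # match or substitution
            )
    return dist[n][m]
```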

  • 01. Change Problem
  • 02. Quiz:- Change Money
    • 01. Resources
    • 02. The Alignment Game
    • 03. Computing Edit Distance
    • 04. Reconstructing an Optimal Alignment
  • 03. Quiz:- Edit Distance
    • 01. Resources
    • 02. Problem Overview
  • 04. Quiz:- Knapsack
    • 01. Knapsack with Repetitions
    • 02. Knapsack without Repetitions
    • 03. Final Remarks
    • 04. Resources
    • 05. Problem Overview
  • 05. Quiz:- Maximum Value of an Arithmetic Expression
    • 01. Subproblems
    • 02. Algorithm
    • 03. Reconstructing a Solution
  • 06. Programming Assignment 4: Dynamic Programming

2.

COURSE 2

Data Structures

About the Course

A good algorithm usually comes together with a set of good data structures that allow the algorithm to manipulate the data efficiently. In this course, we consider the common data structures that are used in various computational problems. You will learn how these data structures are implemented in different programming languages and will practice implementing them in our programming assignments. This will help you to understand what is going on inside a particular built-in implementation of a data structure and what to expect from it. You will also learn typical use cases for these data structures. A few examples of questions that we are going to cover in this class are the following: 1. What is a good strategy for resizing a dynamic array? 2. How are priority queues implemented in C++, Java, and Python? 3. How do you implement a hash table so that the amortized running time of all operations is O(1) on average? 4. What are good strategies to keep a binary tree balanced? You will also learn how services like Dropbox manage to upload some large files instantly and to save a lot of storage space!

WEEK 1

Basic Data Structures

In this module, you will learn about the basic data structures used throughout the rest of this course. We start this module by looking in detail at the fundamental building blocks: arrays and linked lists. From there, we build up two important data structures: stacks and queues. Next, we look at trees: examples of how they’re used in Computer Science, how they’re implemented, and the various ways they can be traversed. Once you’ve completed this module, you will be able to implement any of these data structures and will have a solid understanding of the costs of their operations and the tradeoffs involved in using each data structure.

  • 01. Welcome!
  • 02. Arrays
  • 03. Singly-Linked Lists
  • 04. Doubly-Linked Lists
  • 05. Slides and External References
  • 06. Stacks
  • 07. Queues
  • 08. Slides and External References
  • 09. Trees
  • 10. Tree Traversal
  • 11. Slides and External References
  • 12. Practice Quiz:- Basic Data Structures
    • 01. Available Programming Languages
    • 02. FAQ on Programming Assignments
  • 13. Programming Assignment 1: Basic Data Structures
    • 01. Acknowledgements
WEEK 2

Dynamic Arrays and Amortized Analysis

In this module, we discuss Dynamic Arrays: a way of using arrays when it is unknown ahead of time how many elements will be needed. Here, we also discuss amortized analysis: a method of determining the amortized cost of an operation over a sequence of operations. Amortized analysis is often used when a straightforward worst-case analysis of a single operation gives an overly pessimistic bound, because it can show that the algorithm is actually efficient over the whole sequence. It is used for the analysis of Dynamic Arrays and will also be used at the end of this course to analyze Splay trees.
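
The sketch below illustrates the doubling strategy and counts how many element copies resizing causes, which is the quantity the aggregate method bounds. The class and attribute names are hypothetical, and a fixed-size Python list stands in for a raw array.

```python
class DynamicArray:
    """Array that doubles its capacity whenever it runs out of room."""

    def __init__(self):
        self.capacity = 1
        self.size = 0
        self.data = [None] * self.capacity
        self.copies = 0  # total element copies caused by resizing

    def push_back(self, value):
        if self.size == self.capacity:
            # Allocate twice the space and copy the existing elements over.
            new_data = [None] * (2 * self.capacity)
            for i in range(self.size):
                new_data[i] = self.data[i]
            self.copies += self.size
            self.data = new_data
            self.capacity *= 2
        self.data[self.size] = value
        self.size += 1

    def get(self, index):
        if not 0 <= index < self.size:
            raise IndexError("index out of range")
        return self.data[index]
```

After n calls to `push_back`, the copy counter stays below 2n, so the amortized cost of one call is O(1) even though a single call can take O(n) time.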

  • 01. Dynamic Arrays
  • 02. Amortized Analysis: Aggregate Method
  • 03. Amortized Analysis: Banker’s Method
  • 04. Amortized Analysis: Physicist’s Method
  • 05. Amortized Analysis: Summary
  • 06. Quiz:- Dynamic Arrays and Amortized Analysis
    • 01. Slides and External References
WEEK 3

Priority Queues and Disjoint Sets

We start this module by considering priority queues, which are used to efficiently schedule jobs (either in the context of a computer operating system or in real life), to sort huge files (the most important building block for any Big Data processing algorithm), and to efficiently compute shortest paths in graphs (a topic we will cover in our next course). For this reason, priority queues have built-in implementations in many programming languages, including C++, Java, and Python. We will see that these implementations are based on the beautiful idea of storing a complete binary tree in an array, which allows us to implement all priority queue methods in just a few lines of code. We will then switch to the disjoint sets data structure, which is used, for example, in dynamic graph connectivity and image processing. We will see again how simple and natural ideas lead to an implementation that is both easy to code and very efficient. By completing this module, you will be able to implement both of these data structures efficiently from scratch.
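
The idea of storing a complete binary tree in an array can be sketched as follows. This is a minimal min-heap, not any language's built-in implementation. The class and method names are chosen for illustration, and the node at index i has children at indices 2i + 1 and 2i + 2.

```python
class MinHeap:
    """Binary min-heap stored in a Python list."""

    def __init__(self):
        self.items = []

    def insert(self, value):
        self.items.append(value)
        self._sift_up(len(self.items) - 1)

    def extract_min(self):
        if not self.items:
            raise IndexError("extract_min from an empty heap")
        items = self.items
        items[0], items[-1] = items[-1], items[0]
        smallest = items.pop()
        if items:
            self._sift_down(0)
        return smallest

    def _sift_up(self, i):
        # Swap with the parent while the parent is larger.
        while i > 0:
            parent = (i - 1) // 2
            if self.items[parent] <= self.items[i]:
                break
            self.items[parent], self.items[i] = self.items[i], self.items[parent]
            i = parent

    def _sift_down(self, i):
        # Swap with the smaller child while a child is smaller.
        n = len(self.items)
        while True:
            smallest = i
            for child in (2 * i + 1, 2 * i + 2):
                if child < n and self.items[child] < self.items[smallest]:
                    smallest = child
            if smallest == i:
                return
            self.items[smallest], self.items[i] = self.items[i], self.items[smallest]
            i = smallest
```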

  • 01. Introduction
  • 02. Naive Implementations
  • 03. Slides
  • 04. Binary Trees
  • 05. Tree Height Remark
  • 06. Basic Operations
  • 07. Complete Binary Trees
  • 08. Pseudocode
  • 09. Slides and External References
  • 10. Heap Sort
  • 11. Building a Heap
  • 12. Final Remarks
  • 13. Quiz:- Priority Queues: Quiz
    • 01. Slides and External References
    • 02. Overview
    • 03. Naive Implementations
    • 04. Slides and External References
    • 05. Trees
    • 06. Union by Rank
    • 07. Path Compression
    • 08. Analysis (Optional)
  • 15. Quiz: Disjoint Sets
    • 01. Slides and External References
  • 16. Practice Quiz:- Priority Queues and Disjoint Sets
  • 17. Programming Assignment 2: Priority Queues and Disjoint Sets
WEEK 4

Hash Tables

In this module you will learn about a very powerful and widely used technique called hashing. Its applications include implementation of programming languages, file systems, pattern search, distributed key-value storage and many more. You will learn how to implement data structures to store and modify sets of objects and mappings from one type of object to another. You will see that naive implementations either consume a huge amount of memory or are slow, and then you will learn to implement hash tables that use linear memory and work in O(1) on average! In the end, you will learn how hash functions are used in modern distributed systems and how they are used to optimize storage of services like Dropbox, Google Drive and Yandex Disk!
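
A minimal sketch of a hash table with chaining is shown below. It relies on Python's built-in `hash` instead of the universal families covered in the lectures, and it doubles the number of buckets once the load factor exceeds 1. Both choices, and all names, are assumptions made for this sketch.

```python
class ChainedHashMap:
    """Mapping from keys to values: each bucket holds a list (chain) of pairs."""

    def __init__(self, buckets=8):
        self.buckets = [[] for _ in range(buckets)]
        self.size = 0

    def _chain(self, key):
        return self.buckets[hash(key) % len(self.buckets)]

    def set(self, key, value):
        chain = self._chain(key)
        for i, (existing_key, _) in enumerate(chain):
            if existing_key == key:
                chain[i] = (key, value)  # overwrite an existing key
                return
        chain.append((key, value))
        self.size += 1
        if self.size > len(self.buckets):  # load factor above 1
            self._rehash()

    def get(self, key, default=None):
        for existing_key, value in self._chain(key):
            if existing_key == key:
                return value
        return default

    def _rehash(self):
        # Double the number of buckets and redistribute all pairs.
        old_buckets = self.buckets
        self.buckets = [[] for _ in range(2 * len(old_buckets))]
        for chain in old_buckets:
            for key, value in chain:
                self._chain(key).append((key, value))
```

Keeping the load factor bounded is what keeps the chains short on average, so memory stays linear in the number of stored keys.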

  • 01. Applications of Hashing
  • 02. Analysing Service Access Logs
  • 03. Direct Addressing
  • 04. List-based Mapping
  • 05. Hash Functions
  • 06. Chaining Scheme
  • 07. Chaining Implementation and Analysis
  • 08. Hash Tables
  • 09. Slides and External References
  • 10. Phone Book Problem
  • 11. Phone Book Problem – Continued
  • 12. Universal Family
  • 13. Hashing Integers
  • 14. Proof: Upper Bound for Chain Length (Optional)
  • 15. Proof: Universal Family for Integers (Optional)
  • 16. Hashing Strings
  • 17. Hashing Strings – Cardinality Fix
  • 18. Slides and External References
  • 19. Quiz:- Hash Tables and Hash Functions
    • 01. Search Pattern in Text
    • 02. Rabin-Karp’s Algorithm
    • 03. Optimization: Precomputation
    • 04. Optimization: Implementation and Analysis
    • 05. Slides and External References
    • 06. Instant Uploads and Storage Optimization in Dropbox
    • 07. Distributed Hash Tables
    • 08. Slides and External References
  • 19. Practice Quiz:- Hashing
  • 20. Programming Assignment 3: Hash Tables
WEEK 5

Binary Search Trees

In this module we study binary search trees, which are a data structure for doing searches on dynamically changing ordered sets. You will learn about many of the difficulties in accomplishing this task and the ways in which we can overcome them. In order to do this you will need to learn the basic structure of binary search trees, how to insert and delete without destroying this structure, and how to ensure that the tree remains balanced.

  • 01. Introduction
  • 02. Search Trees
  • 03. Basic Operations
  • 04. Balance
  • 05. Slides and External References
  • 06. AVL Trees
  • 07. AVL Tree Implementation
  • 08. Split and Merge
  • 09. Slides and External References
  • 10. Practice Quiz:- Binary Search Trees
WEEK 6

Binary Search Trees 2

In this module we continue studying binary search trees. We study a few non-trivial applications. We then study a new kind of balanced search tree: Splay Trees. They adapt to the queries dynamically and are optimal in many ways.

  • 01. Applications
  • 02. Slides and External References
  • 03. Splay Trees: Introduction
  • 04. Splay Trees: Implementation
  • 05. (Optional) Splay Trees: Analysis
  • 06. Slides and External References
  • 07. Practice Quiz:- Splay Trees
  • 08. Programming Assignment 4: Binary Search Trees

3.

COURSE 3

Algorithms on Graphs

About the Course

If you have ever used a navigation service to find an optimal route and estimate the time to your destination, you’ve used algorithms on graphs. Graphs arise in various real-world situations: road networks, computer networks and, most recently, social networks! If you’re looking for the fastest way to get to work, the cheapest way to connect a set of computers into a network or an efficient algorithm to automatically find communities and opinion leaders on Facebook, you’re going to work with graphs and algorithms on graphs. In this course, you will first learn what a graph is and what some of its most important properties are. Then you’ll learn several ways to traverse graphs and how you can do useful things while traversing the graph in some order. We will then talk about shortest paths algorithms, from the basic ones to those that open the door to the 1,000,000 times faster algorithms used in Google Maps and other navigational services. You will use these algorithms if you choose to work on our Fast Shortest Routes industrial capstone project. We will finish with minimum spanning trees, which are used to plan road, telephone and computer networks and also find applications in clustering and approximation algorithms.

WEEK 1

Decomposition of Graphs 1

Graphs arise in various real-world situations: road networks, computer networks and, most recently, social networks! If you’re looking for the fastest way to get to work, the cheapest way to connect a set of computers into a network or an efficient algorithm to automatically find communities and opinion leaders on Facebook, you’re going to work with graphs and algorithms on graphs. In this module, you will learn ways to represent a graph as well as basic algorithms for decomposing graphs into parts. In the programming assignment of this module, you will apply the algorithms that you’ve learned to implement efficient programs for exploring mazes, analyzing a Computer Science curriculum, and analyzing road networks. In the first week of the module, we focus on undirected graphs.
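
As a small illustration of representing an undirected graph and decomposing it into parts, the sketch below builds adjacency lists and labels connected components with a depth-first exploration. It assumes vertices are numbered 0 to n - 1. The function name and the input format are hypothetical.

```python
def connected_components(n, edges):
    """Return (number of components, component label of each vertex)."""
    adjacency = [[] for _ in range(n)]
    for u, v in edges:
        adjacency[u].append(v)
        adjacency[v].append(u)

    component = [-1] * n  # -1 means "not visited yet"
    count = 0
    for start in range(n):
        if component[start] != -1:
            continue
        # Explore everything reachable from start, using an explicit stack.
        component[start] = count
        stack = [start]
        while stack:
            u = stack.pop()
            for v in adjacency[u]:
                if component[v] == -1:
                    component[v] = count
                    stack.append(v)
        count += 1
    return count, component
```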

  • 01. Welcome!
  • 02. Graph Basics
  • 03. Representing Graphs
  • 04. Slides and External References
  • 05. Exploring Graphs
  • 06. Connectivity
  • 07. Previsit and Postvisit Orderings
  • 08. Slides and External References
  • 09. Programming Assignment 1: Decomposition of Graphs
WEEK 2

Decomposition of Graphs 2

This week we continue to study graph decomposition algorithms, but now for directed graphs.
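
For directed acyclic graphs, a topological order can be obtained by reversing the depth-first postorder. The sketch below assumes the input graph has no cycles and uses recursion, so it is only suitable for small graphs. The names are chosen for illustration.

```python
def topological_sort(n, edges):
    """Return the vertices 0..n-1 ordered so that every edge goes left to right.

    Assumes the directed graph given by edges (u, v) is acyclic.
    """
    adjacency = [[] for _ in range(n)]
    for u, v in edges:
        adjacency[u].append(v)

    visited = [False] * n
    postorder = []

    def explore(u):
        visited[u] = True
        for v in adjacency[u]:
            if not visited[v]:
                explore(v)
        postorder.append(u)  # u is finished after all its descendants

    for u in range(n):
        if not visited[u]:
            explore(u)
    return postorder[::-1]
```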

  • 01. Directed Acyclic Graphs
  • 02. Topological Sort
  • 03. Strongly Connected Components
  • 04. Computing Strongly Connected Components
  • 05. Slides and External References
  • 06. Programming Assignment 2: Decomposition of Graphs
WEEK 3

Paths in Graphs 1

In this module you will study algorithms for finding Shortest Paths in Graphs. These algorithms have lots of applications. When you launch a navigation app on your smartphone like Google Maps or Yandex.Navi, it uses these algorithms to find you the fastest route from work to home, from home to school, etc. When you search for airplane tickets, these algorithms are used to find a route with the minimum number of plane changes. Unexpectedly, these algorithms can also be used to determine the optimal way to do currency exchange, sometimes making it possible to earn a huge profit! We will cover all these applications, and you will learn Breadth-First Search, Dijkstra’s Algorithm and the Bellman-Ford Algorithm. These algorithms are efficient and lay the foundation for even more efficient algorithms, which you will learn and implement in the Shortest Paths Capstone Project to find the best routes on real maps of cities and countries and to find distances between people in Social Networks. In the end you will be able to find Shortest Paths efficiently in any Graph. This week we will study the Breadth-First Search algorithm.
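
Breadth-First Search finds the minimum number of edges (for example, flight segments) from a source to every reachable vertex. A minimal sketch follows. It assumes the graph is given as a dictionary mapping each vertex to a list of neighbors, and the function name is hypothetical.

```python
from collections import deque


def bfs_distances(adjacency, source):
    """Return a dict: vertex -> number of edges on a shortest path from source."""
    distance = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adjacency.get(u, []):
            if v not in distance:  # first time v is discovered
                distance[v] = distance[u] + 1
                queue.append(v)
    return distance
```

Vertices that cannot be reached from the source are simply absent from the returned dictionary.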

  • 01. Most Direct Route
  • 02. Breadth-First Search
  • 03. Breadth-First Search (continued)
  • 04. Implementation and Analysis
  • 05. Proof of Correctness
  • 06. Proof of Correctness (continued)
  • 07. Shortest-Path Tree
  • 08. Reconstructing the Shortest Path
  • 09. Slides and External References
  • 10. Programming Assignment 3: Paths in Graphs
WEEK 4

Paths in Graphs 2

This week we continue to study Shortest Paths in Graphs. You will learn Dijkstra’s Algorithm, which can be applied to find the shortest route home from work. You will also learn the Bellman-Ford algorithm, which can unexpectedly be applied to choose the optimal way of exchanging currencies. By the end you will be able to find shortest paths efficiently in any Graph.
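
A minimal sketch of Dijkstra's algorithm with a binary heap is given below. It assumes all edge weights are non-negative (with negative weights, Bellman-Ford is needed instead) and that the graph is a dictionary mapping each vertex to a list of (neighbor, weight) pairs. The names are chosen for illustration.

```python
import heapq


def dijkstra(adjacency, source):
    """Return a dict: vertex -> length of a shortest path from source."""
    distance = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > distance.get(u, float("inf")):
            continue  # stale heap entry: u was already settled with a shorter distance
        for v, weight in adjacency.get(u, []):
            candidate = d + weight
            if candidate < distance.get(v, float("inf")):
                distance[v] = candidate
                heapq.heappush(heap, (candidate, v))
    return distance
```

Instead of decreasing keys inside the heap, this sketch pushes a new entry and skips outdated ones when they are popped, which is a common simplification with Python's `heapq`.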

  • 01. Fastest Route
  • 02. Naive Algorithm
  • 03. Dijkstra’s Algorithm: Intuition and Example
  • 04. Dijkstra’s Algorithm: Implementation
  • 05. Dijkstra’s Algorithm: Proof of Correctness
  • 06. Dijkstra’s Algorithm: Running Time
  • 07. Slides and External References
  • 08. Currency Exchange
  • 09. Currency Exchange: Reduction to Shortest Paths
  • 10. Bellman-Ford Algorithm
  • 11. Bellman-Ford Algorithm: Proof of Correctness
  • 12. Negative Cycles
  • 13. Infinite Arbitrage
  • 14. Slides and External References
  • 15. Programming Assignment 4: Paths in Graphs
WEEK 5

Minimum Spanning Trees

In this module, we study the minimum spanning tree problem. We will cover two elegant greedy algorithms for this problem: the first one is due to Kruskal and uses the disjoint sets data structure, the second one is due to Prim and uses the priority queue data structure. In the programming assignment for this module you will be computing an optimal way of building roads between cities and an optimal way of partitioning a given set of objects into clusters (a fundamental problem in data mining).
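
The combination of Kruskal's algorithm with the disjoint sets data structure can be sketched as follows. Edges are assumed to be (weight, u, v) tuples over vertices 0 to n - 1. The sketch uses union by rank and path halving (a simple variant of path compression), and all names are hypothetical.

```python
def kruskal(n, edges):
    """Return (total weight, chosen edges) of a minimum spanning forest."""
    parent = list(range(n))
    rank = [0] * n

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a == root_b:
            return False  # already connected: the edge would close a cycle
        if rank[root_a] < rank[root_b]:
            root_a, root_b = root_b, root_a
        parent[root_b] = root_a
        if rank[root_a] == rank[root_b]:
            rank[root_a] += 1
        return True

    total = 0
    tree = []
    for weight, u, v in sorted(edges):  # lightest edges first
        if union(u, v):
            total += weight
            tree.append((u, v, weight))
    return total, tree
```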

  • 01. Building a Network
  • 02. Greedy Algorithms
  • 03. Cut Property
  • 04. Kruskal’s Algorithm
  • 05. Prim’s Algorithm
  • 06. Slides and External References
  • 07. Programming Assignment 5: Minimum Spanning Trees
WEEK 6

Advanced Shortest Paths Project (Optional)

In this module, you will learn Advanced Shortest Paths algorithms that work in practice thousands of times (up to 25,000 times) faster than the classical Dijkstra’s algorithm on real-world road networks and social network graphs. You will work on a Programming Project based on these algorithms. You will find the shortest paths on real maps of parts of the US and the shortest paths connecting people in social networks. We encourage you not only to use the ideas from this module’s lectures in your implementations, but also to come up with your own ideas for speeding up the algorithm! We encourage you to compete on the forums to see whose implementation is the fastest one 🙂

  • 01. Programming Project: Introduction
  • 02. Bidirectional Search
  • 03. Six Handshakes
  • 04. Bidirectional Dijkstra
  • 05. Finding Shortest Path after Meeting in the Middle
  • 06. Computing the Distance
  • 07. Slides and External References
  • 08. A* Algorithm
  • 09. Performance of A*
  • 10. Bidirectional A*
  • 11. Potential Functions and Lower Bounds
  • 13. Landmarks (Optional)
  • 14. Slides and External References
  • 15. Highway Hierarchies and Node Importance
  • 16. Preprocessing
  • 17. Witness Search
  • 18. Query
  • 19. Proof of Correctness
  • 20. Node Ordering
  • 21. Slides and External References
  • 21. Practice Quiz:- Bidirectional Dijkstra, A* and Contraction Hierarchies
  • 22. Practice Programming Assignment:- Advanced Shortest Paths

4.

COURSE 4

Algorithms on Strings

About the Course

The world and the internet are full of textual information. We search for information using textual queries, and we read websites, books, and e-mails. All of these are strings from the point of view of computer science. To make sense of all that information and make search efficient, search engines use many string algorithms. Moreover, the emerging field of personalized medicine uses many search algorithms to find disease-causing mutations in the human genome.

WEEK 1

Suffix Trees

How would you search for a longest repeat in a string in LINEAR time? In 1973, Peter Weiner came up with a surprising solution that was based on suffix trees, the key data structure in pattern matching. Computer scientists were so impressed with his algorithm that they called it the Algorithm of the Year. In this lesson, we will explore some key ideas for pattern matching that will – through a series of trials and errors – bring us to suffix trees.
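
One of the first steps on the way to suffix trees is the trie of a set of patterns. The sketch below stores the trie as a list of dictionaries, where node 0 is the root and each dictionary maps a character to the index of a child node. This representation and the function name are assumptions made for illustration.

```python
def build_trie(patterns):
    """Build a trie as a list of nodes; each node maps a character to a child index."""
    trie = [{}]  # node 0 is the root
    for pattern in patterns:
        node = 0
        for ch in pattern:
            if ch not in trie[node]:
                trie.append({})                 # create a new node
                trie[node][ch] = len(trie) - 1  # and link it from the current one
            node = trie[node][ch]
    return trie
```

This sketch does not mark where patterns end, which is enough when no pattern is a prefix of another. A real matcher would usually store an end-of-pattern flag in each node.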

  • 01. From Genome Sequencing to Pattern Matching
  • 02. Brute Force Approach to Pattern Matching
  • 03. Herding Patterns into Trie
  • 04. Trie Construction – Pseudocode
  • 05. Herding Text into Suffix Trie
  • 06. Suffix Trees
  • 07. FAQ
  • 08. Slides and External References
  • 09. Available Programming Languages
  • 10. FAQ on Programming Assignments
  • 11. Programming Assignment 1
WEEK 2

Burrows-Wheeler Transform and Suffix Arrays

Although EXACT pattern matching with suffix trees is fast, it is not clear how to use suffix trees for APPROXIMATE pattern matching. In 1994, Michael Burrows and David Wheeler invented an ingenious algorithm for text compression that is now known as the Burrows-Wheeler Transform. They knew nothing about genomics, and they could not have imagined that 15 years later their algorithm would become the workhorse of biologists searching for genomic mutations. But what does text compression have to do with pattern matching? In this lesson you will learn that the fate of an algorithm is often hard to predict: its applications may appear in a field that has nothing to do with the original plan of its inventors.
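
The transform itself is short to state: sort all cyclic rotations of the text and read off the last column. The sketch below is a deliberately naive version, quadratic in memory, meant only to show the definition and its invertibility. It assumes the text ends with a unique `$` symbol that sorts before every other character, and the function names are hypothetical.

```python
def bwt(text):
    """Burrows-Wheeler Transform of text, which must end with a unique '$'."""
    rotations = sorted(text[i:] + text[:i] for i in range(len(text)))
    return "".join(rotation[-1] for rotation in rotations)


def inverse_bwt(last_column):
    """Recover the original text by repeatedly prepending the last column and sorting."""
    n = len(last_column)
    table = [""] * n
    for _ in range(n):
        table = sorted(last_column[i] + table[i] for i in range(n))
    for row in table:
        if row.endswith("$"):
            return row
    return ""
```

Practical implementations avoid building all rotations, for example by deriving the transform from a suffix array and inverting it with the first-last property covered in the lectures.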

  • 01. Burrows-Wheeler Transform
  • 02. Inverting Burrows-Wheeler Transform
  • 03. Using BWT for Pattern Matching
  • 04. Using BWT for Pattern Matching
  • 05. Suffix Arrays
  • 06. Pattern Matching with Suffix Array
  • 07. Approximate Pattern Matching
  • 08. FAQ
  • 09. Slides and External References
  • 10. Programming Assignment:- Programming Assignment 2
WEEK 3

Knuth–Morris–Pratt Algorithm

Congratulations, you have now learned the key pattern matching concepts: tries, suffix trees, suffix arrays and even the Burrows-Wheeler transform! However, some of the results Pavel mentioned remain mysterious: e.g., how can we perform exact pattern matching in O(|Text|) time rather than in O(|Text|*|Pattern|) time as in the naïve brute force algorithm? How can it be that matching a 1000-nucleotide pattern against the human genome is nearly as fast as matching a 3-nucleotide pattern? Also, even though Pavel showed how to quickly construct the suffix array given the suffix tree, he has not revealed the magic behind the fast algorithms for suffix tree construction! In this module, Michael will address some algorithmic challenges that Pavel tried to hide from you 🙂 such as the Knuth-Morris-Pratt algorithm for exact pattern matching and more efficient algorithms for suffix tree and suffix array construction.
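
The linear-time bound comes from the prefix function: for each position, the length of the longest proper prefix of the string that is also a suffix ending at that position. A minimal sketch of the prefix function and of pattern matching built on it follows. It assumes the separator `$` occurs in neither the pattern nor the text, and the function names are chosen for illustration.

```python
def prefix_function(s):
    """pi[i] = length of the longest proper border of s[:i + 1]."""
    pi = [0] * len(s)
    border = 0
    for i in range(1, len(s)):
        while border > 0 and s[i] != s[border]:
            border = pi[border - 1]  # fall back to the next shorter border
        if s[i] == s[border]:
            border += 1
        pi[i] = border
    return pi


def find_occurrences(pattern, text):
    """Return all start positions of pattern in text, in O(|pattern| + |text|) time."""
    if not pattern:
        return []
    combined = pattern + "$" + text
    pi = prefix_function(combined)
    m = len(pattern)
    # A border of length m ending at index i means an occurrence ends there.
    return [i - 2 * m for i in range(m + 1, len(combined)) if pi[i] == m]
```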

  • 01. Exact Pattern Matching
  • 02. Safe Shift
  • 03. Prefix Function
  • 04. Computing Prefix Function
  • 05. Knuth-Morris-Pratt Algorithm
  • 06. Quiz:- Exact Pattern Matching
    • 01. Programming Assignment 3 lasts for two weeks
    • 02. Slides and External References
WEEK 4

Constructing Suffix Arrays and Suffix Trees

In this module we continue studying the algorithmic challenges of string algorithms. You will learn an O(n log n) algorithm for suffix array construction and a linear-time algorithm for constructing a suffix tree from a suffix array. You will also implement these algorithms and the Knuth-Morris-Pratt algorithm in the last Programming Assignment in this course.
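
The doubling strategy behind the suffix array algorithm can be sketched compactly. Note that this sketch swaps the counting sort from the lectures for Python's built-in sort, so it runs in O(n log² n) rather than O(n log n). It assumes the text ends with a unique `$` that sorts before every other character, and the names are hypothetical.

```python
def suffix_array(text):
    """Return the start positions of the suffixes of text in sorted order."""
    n = len(text)
    # Sort cyclic shifts of length 1 and assign equivalence classes.
    order = sorted(range(n), key=lambda i: text[i])
    classes = [0] * n
    for k in range(1, n):
        changed = text[order[k]] != text[order[k - 1]]
        classes[order[k]] = classes[order[k - 1]] + changed

    length = 1
    while length < n:
        # A shift of length 2L is described by the classes of its two halves.
        def pair(i):
            return (classes[i], classes[(i + length) % n])

        order.sort(key=pair)
        new_classes = [0] * n
        for k in range(1, n):
            changed = pair(order[k]) != pair(order[k - 1])
            new_classes[order[k]] = new_classes[order[k - 1]] + changed
        classes = new_classes
        length *= 2
    return order
```

Because of the unique smallest end symbol, sorting cyclic shifts gives the same order as sorting suffixes.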

  • 01. Suffix Array
  • 02. General Strategy
  • 03. Initialization
  • 04. Counting Sort
  • 05. Sort Doubled Cyclic Shifts
  • 06. SortDouble Implementation
  • 07. Updating Classes
  • 08. Full Algorithm
  • 09. Slides and External References
  • 10. Quiz:- Suffix Array Construction
    • 01. Suffix Array and Suffix Tree
    • 02. LCP Array
    • 03. Computing the LCP Array
    • 04. Computing the LCP Array – Additional Slides
    • 05. Construct Suffix Tree from Suffix Array and LCP Array
    • 06. Suffix Tree Construction – Pseudocode
    • 07. Slides and External References
  • 11. Programming Assignment • Programming Assignment 3

5.

COURSE 5

Advanced Algorithms and Complexity

About the Course

You’ve learned the basic algorithms now and are ready to step into the area of more complex problems and algorithms to solve them. Advanced algorithms build upon basic ones and use new ideas. We will start with network flows, which are used in typical applications such as optimal matchings, finding disjoint paths and flight scheduling as well as more surprising ones like image segmentation in computer vision. We then proceed to linear programming with applications in optimizing budget allocation, portfolio optimization, finding the cheapest diet satisfying all requirements and many others. Next we discuss inherently hard problems for which no good exact solutions are known (and are not likely to be found) and how to solve them in practice. We finish with a gentle introduction to streaming algorithms that are heavily used in Big Data processing. Such algorithms are usually designed to process huge datasets without even being able to store them.

WEEK 1

Flows in Networks

Network flows show up in many real world situations in which a good needs to be transported across a network with limited capacity. You can see it when shipping goods across highways and routing packets across the internet. In this unit, we will discuss the mathematical underpinnings of network flows and some important flow algorithms. We will also give some surprising examples on seemingly unrelated problems that can be solved with our knowledge of network flows.
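
A minimal sketch of the Edmonds-Karp algorithm (Ford-Fulkerson with shortest augmenting paths found by breadth-first search) is given below. For brevity it keeps residual capacities in an n by n matrix, which is only reasonable for small networks. The function name and the (u, v, capacity) input format are assumptions made for this sketch.

```python
from collections import deque


def max_flow(n, edges, source, sink):
    """Return the value of a maximum flow from source to sink."""
    if source == sink:
        raise ValueError("source and sink must differ")
    residual = [[0] * n for _ in range(n)]
    for u, v, capacity in edges:
        residual[u][v] += capacity

    flow = 0
    while True:
        # Breadth-first search for a shortest path with spare capacity.
        parent = [-1] * n
        parent[source] = source
        queue = deque([source])
        while queue and parent[sink] == -1:
            u = queue.popleft()
            for v in range(n):
                if parent[v] == -1 and residual[u][v] > 0:
                    parent[v] = u
                    queue.append(v)
        if parent[sink] == -1:
            return flow  # no augmenting path is left

        # Find the bottleneck capacity along the path.
        bottleneck = float("inf")
        v = sink
        while v != source:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]

        # Push flow along the path and update the residual network.
        v = sink
        while v != source:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        flow += bottleneck
```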

  • 01. Reading:-Slides and Resources on Flows in Networks
  • 02. Introduction
  • 03. Network Flows
  • 04. Residual Networks
  • 05. Maxflow-Mincut
  • 06. The Ford–Fulkerson Algorithm
  • 07. Slow Example
  • 08. The Edmonds–Karp Algorithm
  • 09. Bipartite Matching
  • 10. Image Segmentation
  • 11. Quiz:- Flow Algorithms
    • 01. Available Programming Languages
    • 02. FAQ on Programming Assignments
  • 12. Programming Assignment:- Programming Assignment 1
WEEK 2

Linear Programming

Linear programming is a very powerful algorithmic tool. Essentially, a linear programming problem asks you to optimize a linear function of real variables constrained by some system of linear inequalities. This is an extremely versatile framework that immediately generalizes flow problems, but can also be used to discuss a wide variety of other problems from optimizing production procedures to finding the cheapest way to attain a healthy diet. Surprisingly, this very general framework admits efficient algorithms. In this unit, we will discuss why linear programming problems are important, along with some of the tools used to solve them.

Reading • Slides and Resources on Linear Programming

  • 01. Introduction
  • 02. Linear Programming
  • 03. Linear Algebra: Method of Substitution
  • 04. Linear Algebra: Gaussian Elimination
  • 05. Convexity
  • 06. Duality
  • 07. (Optional) Duality Proofs
  • 08. Linear Programming Formulations
  • 09. The Simplex Algorithm
  • 10. (Optional) The Ellipsoid Algorithm
  • 11. Quiz:- Linear Programming Quiz
    • 01. Programming Assignment:- Programming Assignment 2
  • 12. Programming Assignment:- Programming Assignment 1
WEEK 3

NP-complete Problems

Although many of the algorithms you’ve learned so far are applied in practice a lot, it turns out that the world is dominated by real-world problems without a known provably efficient algorithm. Many of these problems can be reduced to one of the classical problems called NP-complete problems: either none of them can be solved by a polynomial-time algorithm, or solving any one of them in polynomial time would win you a million dollars (see Millennium Prize Problems) and eternal worldwide fame for solving the main open problem of computer science, P vs NP. It’s good to know this before trying to solve a problem before tomorrow’s deadline 🙂 Although these problems are very unlikely to be solvable efficiently in the near future, people always come up with various workarounds. In this module you will study the classical NP-complete problems and the reductions between them. You will also practice solving large instances of some of these problems despite their hardness using very efficient specialized software based on tons of research in the area of NP-complete problems.

  • 01. Reading:- Slides and Resources on NP-complete Problems
  • 02. Brute Force Search
  • 03. Search Problems
  • 04. Traveling Salesman Problem
  • 05. Hamiltonian Cycle Problem
  • 06. Longest Path Problem
  • 07. Integer Linear Programming Problem
  • 08. Independent Set Problem
  • 09. P and NP
  • 10. Reductions
  • 11. Showing NP-completeness
  • 12. 3-SAT to Independent Set
  • 13. SAT to 3-SAT
  • 14. Circuit SAT to SAT
  • 15. All of NP to Circuit SAT
  • 16. Using SAT-solvers
  • 17. Quiz:- NP-complete Problems
  • 18. Programming Assignment:- Programming Assignment 3
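
The gap between checking a solution and finding one, which is at the heart of the P vs NP question, can be sketched for SAT as follows. Checking a candidate assignment takes time linear in the formula size, while the naive search below tries all 2^n assignments. The clause encoding (positive integer k for variable k, negative for its negation) and the function names are assumptions chosen for illustration; real SAT solvers are far more sophisticated than this brute force.

```python
from itertools import product


def satisfies(clauses, assignment):
    """Verify a candidate assignment in time linear in the formula size.

    Each clause is a list of non-zero integers: k means variable k,
    -k means the negation of variable k.
    """
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in clauses
    )


def brute_force_sat(clauses, num_vars):
    """Try all 2**num_vars assignments; return one that works, or None."""
    for values in product([False, True], repeat=num_vars):
        assignment = {i + 1: value for i, value in enumerate(values)}
        if satisfies(clauses, assignment):
            return assignment
    return None


# (x1 or not x2) and (x2 or x3) and (not x1 or not x3)
formula = [[1, -2], [2, 3], [-1, -3]]
model = brute_force_sat(formula, 3)
```

The search loop doubles in cost with every added variable, which is why the module turns to reductions and dedicated SAT solvers for large instances.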
WEEK 4

Coping with NP-completeness

After the previous module you might be sad: you’ve just gone through 5 courses in Algorithms only to learn that they are not suitable for most real-world problems. However, don’t give up yet! People are creative, and they need to solve these problems anyway, so in practice there are often ways to cope with an NP-complete problem at hand. We first show that some special cases of NP-complete problems can, in fact, be solved in polynomial time. We then consider exact algorithms that find a solution much faster than the brute force algorithm. We conclude with approximation algorithms that work in polynomial time and find a solution that is close to being optimal.

  • 01. Reading:- Slides and Resources on Coping with NP-completeness
  • 02. Introduction
  • 03. 2-SAT
  • 04. 2-SAT: Algorithm
  • 05. Independent Sets in Trees
  • 06. 3-SAT: Backtracking
  • 07. 3-SAT: Local Search
  • 08. TSP: Dynamic Programming
  • 09. TSP: Branch and Bound
  • 10. Vertex Cover
  • 11. Metric TSP
  • 12. TSP: Local Search
  • 13. Quiz:- Coping with NP-completeness
  • 14. Programming Assignment:- Programming Assignment 4
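
As one illustration of an approximation algorithm, here is a minimal sketch of the classical 2-approximation for Vertex Cover: repeatedly pick an uncovered edge and add both of its endpoints. The chosen edges form a matching, and any cover must contain at least one endpoint of each matched edge, so the result is at most twice the optimum. The function name and the sample graph are illustrative assumptions; the lectures may present the algorithm differently.

```python
def approx_vertex_cover(edges):
    """Return a vertex cover at most twice the size of a minimum one.

    Greedy maximal matching: whenever an edge has no endpoint in the
    cover yet, add both of its endpoints.
    """
    cover = set()
    for u, v in edges:
        if u not in cover and v not in cover:
            cover.update((u, v))
    return cover


# Path graph 1 - 2 - 3 - 4; a minimum vertex cover is {2, 3}.
path_edges = [(1, 2), (2, 3), (3, 4)]
cover = approx_vertex_cover(path_edges)
```

On this path the sketch returns all four vertices, exactly twice the optimum, which shows that the factor of 2 in the guarantee can actually be reached.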
WEEK 5

Streaming Algorithms (Optional)

In most previous lectures we were interested in designing algorithms with fast (e.g. small polynomial) runtime, and assumed that the algorithm has random access to its input, which is loaded into memory. In many modern applications in big data analysis, however, the input is so large that it cannot be stored in memory. Instead, the input is presented as a stream of updates, which the algorithm scans while maintaining a small summary of the stream seen so far. This is precisely the setting of the streaming model of computation, which we study in this lecture. The streaming model is well-suited for designing and reasoning about small space algorithms. It has received a lot of attention in the literature, and several powerful algorithmic primitives for computing basic stream statistics in this model have been designed, several of them impacting the practice of big data analysis. In this lecture we will see one such algorithm (CountSketch), a small space algorithm for finding the top k most frequent items in a data stream.

  • 01. Introduction
  • 02. Heavy Hitters Problem
  • 03. Reduction 1
  • 04. Reduction 2
  • 05. Basic Estimate 1
  • 06. Basic Estimate 2
  • 07. Final Algorithm 1
  • 08. Final Algorithm 2
  • 09. Proofs 1
  • 10. Proofs 2
  • 11. Practice Quiz • Quiz: Heavy Hitters
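
The idea of keeping a small summary of the stream can be sketched with a simplified CountSketch. Each row hashes an item to one bucket and to a random sign; an update adds the signed count to that bucket, and the frequency estimate is the median over rows of the signed bucket values. The class below is a sketch under stated assumptions: items are non-negative integers, the hash functions are simple linear hashes modulo a prime, and the parameter names depth, width and seed are hypothetical rather than taken from the lectures.

```python
import random
from statistics import median


class CountSketch:
    """Simplified CountSketch for streams of non-negative integer items."""

    PRIME = (1 << 61) - 1  # Mersenne prime used for the linear hashes

    def __init__(self, depth, width, seed=0):
        rng = random.Random(seed)
        self.width = width
        self.tables = [[0] * width for _ in range(depth)]
        # Per row: (a, b) drive the bucket hash, (c, d) drive the sign hash.
        self.params = [
            tuple(rng.randrange(1, self.PRIME) for _ in range(4))
            for _ in range(depth)
        ]

    def _bucket_and_sign(self, row, item):
        a, b, c, d = self.params[row]
        bucket = ((a * item + b) % self.PRIME) % self.width
        sign = 1 if ((c * item + d) % self.PRIME) % 2 else -1
        return bucket, sign

    def update(self, item, count=1):
        """Process one stream update: `item` occurred `count` more times."""
        for row, table in enumerate(self.tables):
            bucket, sign = self._bucket_and_sign(row, item)
            table[bucket] += sign * count

    def estimate(self, item):
        """Median over rows of the signed bucket value for `item`."""
        estimates = []
        for row, table in enumerate(self.tables):
            bucket, sign = self._bucket_and_sign(row, item)
            estimates.append(sign * table[bucket])
        return median(estimates)


sketch = CountSketch(depth=5, width=64, seed=1)
for _ in range(100):
    sketch.update(7)
```

Memory use is depth times width counters regardless of the stream length. When other items collide with the queried one, the random signs make their contributions cancel in expectation, and the median over rows damps the remaining noise; the estimate is exact only when there are no collisions, as in the single-item stream above.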

6.

COURSE 6

Genome Assembly Programming Challenge

About the Capstone Project

In Spring 2011, thousands of people in Germany were hospitalized with a deadly disease that started as food poisoning with bloody diarrhea and often led to kidney failure. It was the beginning of the deadliest outbreak in recent history, caused by a mysterious bacterial strain that we will refer to as E. coli X. Soon, German officials linked the outbreak to a restaurant in Lübeck, where nearly 20% of the patrons had developed bloody diarrhea in a single week. At this point, biologists knew that they were facing a previously unknown pathogen and that traditional methods would not suffice: computational biologists would be needed to assemble and analyze the genome of the newly emerged pathogen. To investigate the evolutionary origin and pathogenic potential of the outbreak strain, researchers started a crowdsourced research program. They released bacterial DNA sequencing data from one of the patients, which elicited a burst of analyses carried out by computational biologists on four continents.

They even used GitHub for the project: https://github.com/ehec-outbreak-crowdsourced/BGI-data-analysis/wiki. The 2011 German outbreak represented an early example of epidemiologists collaborating with computational biologists to stop an outbreak. In this Genome Assembly Programming Challenge, you will follow in the footsteps of the bioinformaticians investigating the outbreak by developing a program to assemble the genome of E. coli X from millions of overlapping substrings of the E. coli X genome.

WEEK 1

The 2011 European E. coli Outbreak

In April 2011, hundreds of people in Germany were hospitalized with a deadly disease that often started as food poisoning with bloody diarrhea. It was the beginning of the deadliest outbreak in recent history, caused by a mysterious bacterial strain that we will refer to as E. coli X. Within a few months, the outbreak had infected thousands and killed 53 people. To prevent the further spread of the outbreak, computational biologists all over the world had to answer the question “What is the genome sequence of E. coli X?” in order to figure out what new genes it acquired to become pathogenic. The 2011 German outbreak represented an early example of epidemiologists collaborating with computational biologists to stop an outbreak. In this Genome Assembly Programming Challenge, you will follow in the footsteps of the bioinformaticians investigating the outbreak by developing a program to assemble the genome of the deadly E. coli X strain. However, before you embark on building a program for assembling the E. coli X strain, we have to explain some genomic concepts and warm you up by having you solve a simpler problem of assembling a small virus.

  • 01. 2011 European E. coli outbreak
  • 02. Assembling phage genome
  • 03. Project Description
  • 04. Peer Review:- What does it mean to assemble a genome?
  • 05. Peer Review:- What does it mean to assemble a genome from error-prone-reads?
  • 06. Programming Assignment 1: Assembling the phi X174 Genome Using Overlap Graphs
WEEK 2

Assembling Genomes Using de Bruijn Graphs

The DNA sequencing approach that led to the assembly of a small virus in 1977 went through a series of transformations that contributed to the emergence of personalized medicine a few years ago. By the late 1980s, biologists were routinely sequencing viral genomes containing hundreds of thousands of nucleotides, but the idea of sequencing a bacterial (let alone the human) genome containing millions (or even billions) of nucleotides remained preposterous and would have cost billions of dollars. In 1988, three biologists (independently and simultaneously!) came up with an idea to reduce sequencing cost and proposed the futuristic and at the time completely implausible method of DNA arrays. None of these three biologists could have possibly imagined that the implications of his own experimental research would eventually bring him face-to-face with challenging algorithmic problems. In this module you will learn about the algorithmic challenge of DNA sequencing using information about short k-mers provided by DNA arrays. You will also travel to the 18th century to learn about the Bridges of Königsberg and solve a related problem of assembling a SQTL puzzle!

  • 01. DNA arrays
  • 02. Assembling genomes from k-mers
  • 03. De Bruijn graphs
  • 04. Bridges of Königsberg and universal strings
  • 05. Euler theorem
  • 06. Assignment 2: Assembling the phi X174 Genome Using De Bruijn Graphs
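
The connection between k-mers, de Bruijn graphs and Euler's theorem can be sketched as follows: every k-mer becomes an edge from its (k-1)-symbol prefix to its (k-1)-symbol suffix, and a path that uses every edge exactly once (an Eulerian path) spells a string containing exactly the given k-mers. The sketch below assumes idealized input: error-free k-mers that cover a linear string perfectly, so an Eulerian path exists. The function names are hypothetical, and the assignment itself may require a different setup (for example, circular genomes).

```python
from collections import defaultdict


def de_bruijn(kmers):
    """Build the de Bruijn graph: one edge prefix -> suffix per k-mer."""
    graph = defaultdict(list)
    for kmer in kmers:
        graph[kmer[:-1]].append(kmer[1:])
    return graph


def eulerian_path(graph):
    """Hierholzer's algorithm; assumes an Eulerian path exists."""
    in_degree = defaultdict(int)
    for targets in graph.values():
        for node in targets:
            in_degree[node] += 1
    # Start at the node with one more outgoing than incoming edge, if any.
    start = next(
        (u for u in graph if len(graph[u]) - in_degree[u] == 1),
        next(iter(graph)),
    )
    remaining = {u: list(targets) for u, targets in graph.items()}
    stack, path = [start], []
    while stack:
        node = stack[-1]
        if remaining.get(node):
            stack.append(remaining[node].pop())  # follow an unused edge
        else:
            path.append(stack.pop())  # dead end: node is finished
    return path[::-1]


def assemble(kmers):
    """Spell the string along an Eulerian path of the de Bruijn graph."""
    path = eulerian_path(de_bruijn(kmers))
    return path[0] + "".join(node[-1] for node in path[1:])


# The 3-mers of ACGTTGCA, given in arbitrary order.
genome = assemble(["GTT", "ACG", "TGC", "CGT", "GCA", "TTG"])
```

When the graph has several Eulerian paths, as happens with repeated (k-1)-mers, the sketch returns just one of them, so the assembled string is guaranteed to have the right k-mer composition but is not necessarily the original sequence.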
WEEK 3

Genome Assembly Faces Real Sequencing Data

Our discussion of genome assembly has thus far relied upon various assumptions. In this module, we will face practical challenges introduced by quirks in modern sequencing technologies and discuss some algorithmic techniques that have been devised to address these challenges. Afterwards, you will assemble the smallest bacterial genome, which belongs to a bacterium that lives symbiotically inside leafhoppers. Its sheltered life has allowed it to reduce its genome to only about 112,091 nucleotides and 137 genes. After that, you will be ready to assemble the E. coli X genome!

  • 01. Splitting the genome into contigs
  • 02. From reads to read-pairs
  • 03. Genome assembly faces real sequencing data
  • 04. Bridges of Königsberg and universal strings
  • 05. Programming Assignment 3: Genome Assembly Faces Real Sequencing Data

WHAT YOU GET

SQTL Learning Centre

Gain free access to a variety of supplemental resources like handouts, reference material, guides, lecture transcripts and student forums for a period of 12 months.

Faculty Support

Get your doubts resolved by the SQTL Faculty via email, phone or chat.

Q&A Sessions

2 hours of sessions every month, conducted by IT mentors to resolve your questions and doubts.

SQTL Lab

Access to a cloud-based solution for hands-on experience with real-life business data using the latest tools.

Career Counseling

Get professional guidance on resume building, interview preparation and identification of relevant opportunities in the IT field.

Placement Assistance

Get help landing your dream job via industry references, interview preparation and specialized walk-in drives at the SQTL Campus.

FINAL OUTCOME

After completing this specialization, you will have mastered the most sought-after skill set in the field of computer science and engineering. The skills taught in this course are independent of any particular programming language. You can start your journey with top-ranked software companies.

WHO SHOULD DO IT

This course is meant for anyone interested in a career in the software industry.

If you need clarity on how online training works, please visit How It Works.

Or if you have any other questions, please visit the FAQs page.

In case you still have any unanswered questions, we encourage you to register for an upcoming webinar.