[LINK] New MIT Big Data Course

stephen at melbpc.org.au stephen at melbpc.org.au
Fri Jan 10 15:19:50 AEDT 2014


MIT Professional Education

"TACKLING THE CHALLENGES OF BIG DATA"

Dates: Course Runs Online, March 4 - April 1, 2014 | Fee: USD$495

<http://web.mit.edu/professional/onlinex-programs/courses/tackling_the_challenges_of_big_data.html>


COURSE DESCRIPTION

This new Online X course will survey state-of-the-art topics in Big Data, 
looking at data collection (smartphones, sensors, the Web), data storage 
and processing (scalable relational databases, Hadoop, Spark, etc.), 
extracting structured data from unstructured data, systems issues 
(exploiting multicore, security), analytics (machine learning, data 
compression, efficient algorithms), visualization, and a range of 
applications.

Each module will introduce broad concepts and present the most recent
research developments.

The course will be taught by a team of world experts from MIT and the MIT 
Computer Science and Artificial Intelligence Laboratory (CSAIL) in each of 
these areas.

Registration Deadlines:

It is highly recommended that you register as soon as possible. 

Registration will be accepted up until February 28, but registrants will 
not be given access to the course site or materials until payment is 
received.

Course Flyer (pdf):

<http://web.mit.edu/professional/pdf/oxp-docs/BigDataCourseFlyer.pdf>


COURSE OVERVIEW:

The course is held over four weeks and will provide the following:

Online accessibility 24/7 – self-paced

Five modules covering 18 topic areas, with 20 hours of video

Five assessments to reinforce key learning concepts of each module

Case studies

Discussion Forums for participants to discuss thought-provoking questions
in medicine, social media, finance, and transportation posed by the MIT
faculty teaching the course; share, engage, and ideate with other
participants

Community Wiki for sharing additional resources, suggested readings, and
related links

Participants will also take away:

Course materials from all presentations

30-day access to the archived course (includes videos, discussion boards,
content, and Wiki)

KEY BENEFITS

Position yourself in your organization as a vital subject matter expert on
the major technologies and applications driving the Big Data revolution in
your industry, and position your company to move forward and stay
competitive

Engage confidently with management on opportunities and Big Data challenges 
faced by your industry; analyze emerging technologies and how those 
technologies can be applied effectively to address real business problems 
while unlocking the value of data and its potential use for company growth

Learn and assess the issues of scalability – make your work more productive

Gain valuable insights and access to CSAIL research that will differentiate
how you and your company break down Big Data to save time and money while
making work more efficient

Enjoy a convenient, flexible schedule: access 24 hours a day from anywhere
in the world, with no travel time, at low cost, taught by world-renowned
MIT faculty

MIT PROFESSIONAL EDUCATION ALUMNI BENEFITS

After completing Tackling the Challenges of Big Data, participants will 
become alumni of MIT Professional Education and will receive all the 
associated benefits and courtesies listed below.

Receive exclusive discounts on all future Short Programs and Online X
Programs courses

Access to our restricted MIT Professional Education alumni group on
LinkedIn, including invitations to join all MIT Professional Education
social media platforms

Networking opportunities with other individuals from around the globe
working in a variety of industries and interested in technology, computer
science, entrepreneurship, science, research, and Big Data, among many
others

Email distribution of our MIT Professional Education newsletter

Finally, participants will join the MIT Professional Education alumni
mailing list, where they will receive advance notice of special
announcements on upcoming courses, programs, and events


EARN A CERTIFICATE OF COMPLETION

Upon successful completion of the course, a Certificate of Completion will
be awarded by MIT Professional Education.

To earn a Certificate of Completion in this course, participants should 
watch all the videos, actively participate in the discussion boards, and 
complete all assessments by April 01, 2014, with an average success rate of
80 percent.

The Certificate of Completion will be awarded on April 02, 2014, by MIT 
Professional Education.

Grading: Grades are not awarded for this course.


WHO SHOULD PARTICIPATE

Prerequisite(s): This course is designed to be suitable for anyone with a 
bachelor’s level education in computer science.

Tackling the Challenges of Big Data is designed to be valuable to both 
individuals and companies because it provides a platform for discussion 
from numerous technical perspectives. The concepts delivered through this 
course can spark idea generation among team members and the knowledge 
gained can be applied to their company’s approach to Big Data problems and 
shape the way businesses operate today.

The course is broadly applicable, to both early-career professionals and
senior technical managers.

Participants will benefit the most from the concepts taught in this course 
if they have at least three years of work experience.

Participants may include:

Engineers who need to understand the new Big Data technologies and concepts
to apply in their work

Technical managers who want to familiarize themselves with these emerging
technologies

Entrepreneurs who would like to gain perspective on trends and future
capabilities of Big Data technology

LEARNING OBJECTIVES

Participants will learn the state of the art in Big Data. The course aims
to reduce the time from research to industry dissemination and expose 
participants to some of the most recent ideas and techniques in Big Data.

After taking this course, participants will:

Distinguish what Big Data is (volume, velocity, variety), and learn where
it comes from and what the key challenges are

Determine how and where Big Data challenges arise in a number of domains,
including social media, transportation, finance, and medicine

Investigate multicore challenges and how to engineer around them

Explore the relational model, SQL, and capabilities of new relational
systems in terms of scalability and performance

Understand the capabilities and pitfalls of NoSQL systems, and how the
NewSQL movement addresses these issues

Learn how to get the most from the MapReduce programming model: what its
benefits are, how it compares to relational systems, and new developments
that improve its performance and robustness

Learn why building secure Big Data systems is so hard and survey recent
techniques that help, including direct processing on encrypted data,
information flow control, auditing, and replay

Discover user interfaces for Big Data and what makes building them
difficult

Measure the need for and understand how to create sublinear time algorithms

Manage the development of data compression algorithms

Formulate the “data integration problem” (semantic and schematic
heterogeneity) and discuss recent breakthroughs in solving this problem

Understand the benefits and challenges of open-linked data

Comprehend machine learning and algorithms for data analytics


COURSE OUTLINE

Modules, Topics, and Faculty


MODULE ONE: INTRODUCTION AND USE CASES

The introductory module aims to give a broad survey of Big Data challenges 
and opportunities and highlights applications as case studies.

Introduction: Big Data Challenges (Sam Madden)

Identify and understand the application of existing tools and new
technologies needed to solve next-generation data challenges
Challenges posed by the ability to scale and the constraints of today's
computing platforms and algorithms
Addressing the universal issue of Big Data and how to use the data to align
with a company’s mission and goals

Case Study: Transportation (Daniela Rus)

Data-driven models for transportation
Coresets for Global Positioning System (GPS) data streams
Congestion-aware planning

Case Study: Visualizing Twitter (Sam Madden)

Understand the power of geocoded Twitter data
Learn how Graphics Processing Units (GPUs) can be used for extremely high
throughput data processing
Utilize MapD, a new GPU-based database system for visualizing Twitter in
action

MODULE TWO: BIG DATA COLLECTION

The data capture module surveys approaches to data collection, cleaning, 
and integration.

Data Cleaning and Integration (Michael Stonebraker)

Available tools and protocols for performing data integration
Curation issues (cleaning, transforming, and consolidating data)

Hosted Data Platforms and the Cloud (Matei Zaharia)

How performance, scalability, and cost models are impacted by hosted data
platforms in the cloud
Internal and external platforms to store data

MODULE THREE: BIG DATA STORAGE

The module on Big Data storage describes modern approaches to databases and
computing platforms.

Modern Databases (Michael Stonebraker)

Survey data management solutions in today’s marketplace, including
traditional RDBMS, NoSQL, NewSQL, and Hadoop
Strategic aspects of database management

Distributed Computing Platforms (Matei Zaharia)

Parallel computing systems that enable distributed data processing on
clusters, including MapReduce, Dryad, Spark
Programming models for batch, interactive, and streaming applications
Tradeoffs between programming models

NoSQL, NewSQL (Sam Madden)

Survey of new emerging database and storage systems for Big Data
Tradeoffs between reduced consistency, performance, and availability
Understanding how rethinking the design of database systems can lead to
order-of-magnitude performance improvements

MODULE FOUR: BIG DATA SYSTEMS

The systems module discusses solutions for creating and deploying working
Big Data systems and applications.

Multicore Scalability (Nickolai Zeldovich)

Understanding what affects the scalability of concurrent programs on
multicore systems
Lock-free synchronization for data structures in cache-coherent shared
memory

Security (Nickolai Zeldovich)

Protecting confidential data in a large database using encryption
Techniques for executing database queries over encrypted data without
decryption

User Interfaces for Data (David Karger)

Principles of and tools for data visualization and exploratory data
analysis
Research in data-oriented user interfaces

MODULE FIVE: BIG DATA ANALYTICS

The analytics module covers state-of-the-art algorithms for very large data 
sets and streaming computation.

Machine Learning Tools (Tommi Jaakkola)

Computational capabilities of the latest advances in machine learning
Advanced machine learning algorithms and techniques for application to
large data sets

Fast Algorithms I (Ronitt Rubinfeld)

Efficiency in data analysis

Fast Algorithms II (Piotr Indyk)

Advanced applications of efficient algorithms
Scale-up properties

Data Compression (Daniela Rus)

Reducing the size of the Big Data file and its impact on storage and
transmission capacity
Design of data compression schemes such as coresets to apply to Big Data

Case Study: Information Summarization (Regina Barzilay)

Applications: Medicine (John Guttag)

Utilize data to improve operational efficiency and reduce costs
Analytics and tools to improve patient care and control risks
Using Big Data to improve hospital performance and equipment management

Applications: Finance (Andrew Lo)


COURSE VISION

MIT wants to help solve the world’s biggest and most important problems,
such as Big Data. Tackling the Challenges of Big Data is an online course
developed by the MIT Computer Science and Artificial Intelligence
Laboratory in collaboration with MIT Professional Education and edX.

MIT Professional Education

For 65 years MIT Professional Education has been providing a gateway to
renowned MIT research, knowledge, and expertise for those engaged in
science and technology worldwide, through advanced education programs
designed for working professionals.

CSAIL

The Computer Science and Artificial Intelligence Laboratory – known as
CSAIL – is the largest research laboratory at MIT and one of the world’s
most important centers of information technology research.

edX

Open edX is the open-source educational platform developed by edX and its
open-source partners, including leading institutions. It powers the edX.org
destination site and research initiatives.

LOCATION

This course takes place online. We can also offer this course for large 
groups of employees from the same organization online. Please contact MIT 
Professional Education (onlinex at mit.edu) to discuss your training and 
education needs.


CSAIL and its members have played a key role in the computer revolution.
The lab’s researchers have been key movers in developments like
time-sharing, massively parallel computers, public key encryption, the mass
commercialization of robots, and much of the technology underlying the
ARPANet, Internet, and the World Wide Web.

CSAIL members (former and current) have launched more than 100 companies, 
including RSA Data Security, Akamai, iRobot, Meraki, ITA Software, and 
Vertica. The Lab is home to the World Wide Web Consortium (W3C).

With backgrounds in data, programming, finance, multicore technology, 
database systems, robotics, transportation, hardware, and operating 
systems, each MIT Tackling the Challenges of Big Data professor brings 
their own unique experience and expertise to the course.
