Pricing is per delegate, giving you huge savings over the cost of individual courses.
4.6 out of 5 rating | Last updated 14/11/2024 | English
4 Days
24 CPD hours
Overview of data science and machine learning at scale
Overview of the Hadoop ecosystem
Working with HDFS data and Hive tables using Hue
Introduction to Cloudera Data Science Workbench
Overview of Apache Spark 2
Reading and writing data
Inspecting data quality
Cleansing and transforming data
Summarizing and grouping data
Combining, splitting, and reshaping data
Exploring data
Configuring, monitoring, and troubleshooting Spark applications
Overview of machine learning in Spark MLlib
Extracting, transforming, and selecting features
Building and evaluating regression models
Building and evaluating classification models
Building and evaluating clustering models
Cross-validating models and tuning hyperparameters
Building machine learning pipelines
Deploying machine learning models
Spark, Spark SQL, and Spark MLlib
PySpark and sparklyr
Cloudera Data Science Workbench (CDSW)
Hue
This workshop covers data science and machine learning workflows at scale using Apache Spark 2 and other key components of the Hadoop ecosystem. The workshop emphasizes the use of data science and machine learning methods to address real-world business challenges. Using scenarios and datasets from a fictional technology company, students discover insights to support critical business decisions and develop data products to transform the business. The material is presented through a sequence of brief lectures, interactive demonstrations, extensive hands-on exercises, and discussions. The Apache Spark demonstrations and exercises are conducted in Python (with PySpark) and R (with sparklyr) using the Cloudera Data Science Workbench (CDSW) environment. The workshop is designed for data scientists who currently use Python or R to work with smaller datasets on a single machine and who need to scale up their analyses and machine learning models to large datasets on distributed clusters. Data engineers and developers with some knowledge of data science and machine learning may also find this workshop useful.
The Nexus Humans Cloudera Data Scientist Training program is a workshop that combines sessions, lessons, and masterclasses designed to move your learning forward.
This bootcamp-style experience combines interactive lectures, hands-on labs, and collaborative hackathons, all designed to reinforce fundamental concepts. Guided by experienced coaches, each session offers insights and practical skills for building your expertise. Whether you are new to the field or a seasoned professional, this course aims to equip you with the knowledge and skills you need. While we feel this is the best option for Cloudera Data Scientist Training, and it is one of our Top 10 courses, we encourage you to read the course outline to make sure the content is right for you. Private sessions, closed classes, and dedicated events are also available live online, at our training centres in Dublin and London, and at your offices anywhere in the UK, Ireland, or across EMEA. The 4-day Cloudera Data Scientist Training course gives you up to 24 CPD hours/structured learning hours. If you need a letter or certificate in a particular format for your association, organisation, or professional body, please just ask.
Yes, we provide corporate training, dedicated training, and closed classes for the Cloudera Data Scientist Training course. These can take place anywhere in the UK, including England, Scotland, Cymru (Wales), and Northern Ireland, or live online, allowing teams from across the UK or further afield to attend a single training event and save on travel and delivery expenses.
The Cloudera Data Scientist Training course takes place over 4 days, with each day lasting approximately 8 hours, including short breaks and a lunch break, to ensure that delegates get the most out of the day.
Yes, the discount code PENPAL5 is currently available for the Cloudera Data Scientist Training course. Other discount codes may also be available, but only one discount code or special offer can be used for each booking. This discount code is available to both companies and individuals.
When you organise training, we understand there is a risk that some people may fall ill or become unavailable. To mitigate that risk, we include training insurance for each delegate enrolled on our public schedule: if the need arises, they are welcome to sit the same public class within 6 months at no charge.