FREE for 6 days — AWS Certified Cloud Practitioner course

Hello!

Just a quick flash announcement that my new AWS Certified Cloud Practitioner course is going to be free until coupons run out!

On top of that, I want to encourage you to start learning TODAY, so I am running a Learning Contest: 5 lucky people among those who finish the course at 100% before June 30th will get their certification costs paid for by me!

Read on…!

This foundational AWS certification is PERFECT if you want to start learning AWS:

No catch, pure learning: over 10 hours of videos, 350 slides, 180 lectures, 141 quiz questions, and one full practice exam.

Why am I doing this?

I want to give back to the community that has given me so much.

Join me and 350,000+ fellow learners on the AWS journey!

Please don’t enroll in the course if you don’t intend to start it — because space is limited, I want as many people as possible to benefit from this release! Thank you :)

And if you already know AWS… forward this email to a colleague or a friend who may need it!

COURSE — PRACTICE EXAMS — LEARNING CONTEST

1) How to get the course:

  • Link with free coupon: Click here
  • Paid coupon: (if you prefer paying — thank you ✌️)

Remember, if you enroll in the course for free, please start it this week :)

2) Extra practice exams:

One practice exam is included in the course, but I recommend also getting my extra practice exams to help you pass:

  • Paid Practice Exams:

3) Learning Contest rules:

What’s a Learning Contest?

I will pay for the certification costs for FIVE lucky learners! It’s my way of encouraging you to learn AWS, complete this course, and pass your certification. Are you in?

The GRAND prize for FIVE lucky learners:

I will pay for your AWS certification cost (a $100 value)

The Rules:

  • Finish viewing the course at 100% (all lectures, quizzes) by June 30th, 11:59 PM PST
  • Your entry is automatically registered as soon as you finish the course (I’m tracking the progress of each student)

Who:

  • The winners will be selected at random amongst those who have finished the course
  • The winners will be announced via Educational Announcements
  • I will also reach out to the winners directly through private messages on Udemy

Only 3 weeks remain until the end of the contest, so start learning AWS today and challenge yourself!

I hope you’re excited, I know I am.

Start learning AWS today, and I will see you in the course!

Best of luck for the learning contest, and happy learning :)

Stephane Maarek

If you liked this article, don’t forget to clap and share!

How to prepare for the Confluent Certified Operator for Apache Kafka (CCOAK) exam

Getting the Apache Kafka certification from Confluent is a great way to have your skills recognized by your current and future employers. But how do you prepare for that certification?

Certification

I personally find the CCOAK certification to be of medium difficulty, and if you have seriously used Apache Kafka in the past year, many questions shouldn’t be surprising. Nonetheless, it is best to be as prepared as you can, as the 150 USD certification fee is not something you want to pay twice.

I have created a practice exam course with 2 practice tests of 40 questions each, so 80 sample questions. Completing those is a great first step to assess where you stand today. Additionally, I recommend checking out the sample questions from Confluent. You will find three sample questions in this blog, read on!

Please also read the CCOAK certification FAQ.

Exam Details

  • 40 multiple choice questions in 90 minutes
  • Designed to validate professionals with a minimum of 6–12 months of Confluent experience
  • Remotely proctored on your computer
  • Available globally in English
  • You can currently take the certification straight from your computer; you don’t need to go to an exam center
  • The certification is valid for two years
  • Read the full exam details on the Confluent website

Studying for the Certification

If you find yourself lacking the knowledge, you may already know I have plenty of online content to help you learn Kafka at your own pace, with over 35 hours of videos. Countless students of mine have successfully passed the CCOAK certification exam after studying with my online courses.

Let’s deep dive into what you need to know for each exam topic:

Apache Kafka Fundamentals

You need to know everything about brokers, topics, partitions, offsets, producers and message keys, consumers and consumer groups, delivery semantics, Zookeeper, and more. My Apache Kafka for Beginners course will help you get there.

Kafka Setup

You’ll need to know the ins and outs of setting up a Kafka cluster, important configurations, and so on.

Kafka Security

Understanding some of the intricate details of Kafka security will help you address the 2–3 security questions on the exam.

Kafka Monitoring & Operations

Knowledge in these areas is key to passing the exam — it’s a Confluent Certified Operator exam after all.

Other knowledge

  • Kafka Streams, KSQL, Kafka Connect should be known at a high level
  • The same goes for the Confluent components at a high level: Confluent Schema Registry, Confluent REST Proxy, Confluent Replicator, Confluent Control Centre.
  • If you’re interested in a deep dive into these, my dedicated courses cover them.

Sample questions

The following questions are an extract of my practice exam course (responses at the bottom of the blog):

Q1: You have created a Producer but some messages have failed to be sent to Kafka on the first try. You have now included the setting `retries=5` and after careful analysis of your topic, you have seen that sometimes duplicates are sent by the producer. How can you solve that issue?

  • A: Set `enable.idempotence=true`
  • B: Set `retries=Integer.MAX_VALUE`
  • C: Set `retries=0`
  • D: Enable compression

Q2: A consumer inside a group is reading messages from Kafka. After calling `.poll(long)`, the consumer processes an unusually large batch of messages and gets stuck on it for a long time. The consumer is then disconnected from the group. Which setting triggered this behavior?

  • A: `session.timeout.ms`
  • B: `max.poll.interval.ms`
  • C: `max.poll.records`

Q3: What happens if you send a message to Kafka that does not contain any partition key?

  • A: The Kafka broker will refuse the message
  • B: The message will be sent as is to a random partition
  • C: The Kafka client will generate a random partition key and assign it to the message
  • D: The Kafka broker will generate a random partition key and assign it to the message

Closing Comments

Don’t rush your certification, especially if you’re just getting started with Apache Kafka, as I find the exam really great at testing your actual experience using Apache Kafka. Take your time to learn and practice, and answering the questions will seem natural.

If I have helped you get the certification, leave a comment here or reach out to me on Twitter!

Happy learning and I wish you good luck with the CCOAK certification!

Quiz answers: 1A, 2B, 3B
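
To make the answers concrete, here is a minimal consumer sketch in Java (broker address and group id are hypothetical) illustrating Q2: if processing a polled batch takes longer than max.poll.interval.ms, the consumer is evicted from the group, so you either raise that timeout or lower max.poll.records.

import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

Properties props = new Properties();
props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // hypothetical broker
props.put(ConsumerConfig.GROUP_ID_CONFIG, "my-group");                // hypothetical group id
props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
// Q2: a batch that takes longer than max.poll.interval.ms to process gets the
// consumer kicked out of the group; raise it, or lower max.poll.records
props.put(ConsumerConfig.MAX_POLL_INTERVAL_MS_CONFIG, 600000); // 10 minutes
props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, 100);
KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);

For Q1, the matching fix sits on the producer side: enable.idempotence=true removes the duplicates that plain retries can introduce.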

Learn Apache Kafka like never before

My new learning resource is the quickest, easiest and most effective way for you to learn Apache Kafka.

  • It’s absolutely free.
  • It’s open and there is no registration required.
  • It’s packed full of deep-dive lessons and practical tutorials, perfect for everyone from beginners to experts.

What’s new for AWS Certified Solutions Architect Associate (SAA-C02)?

Versus SAA-C01.

Today, I’m delighted to bring you a HUGE update to my AWS Certified Solutions Architect Associate course.

I have updated 4 hours of content and added an extra 4 hours dedicated to the SAA-C02 exam, bringing the total video time of the course to 22 hours.

Let’s do a deep dive!

In this article, you will find the list of topics I have added to the course to cover the new exam, as well as a FAQ.

Happy reading!

New & Updated Lectures

AWS Fundamentals: IAM & EC2
A deeper dive into Reserved Instances options, as well as how Spot Instances and Spot Fleets work. Finally, a word on what an ENI is.

High Availability and Scalability: ELB & ASG
A full refresh of this section to include the latest information on CLB, ALB, and NLB, and to clearly explain how some advanced features of ELB work. The ASG is now demonstrated with Launch Templates, which are more powerful than Launch Configurations.

EC2 Storage — EBS & EFS
Understanding the IOPS implications of EBS volumes and of Instance Stores.

RDS + Aurora + ElastiCache
A deeper dive into the differences between RDS Read Replicas and Multi-AZ, covering the performance, DR, and cost implications. A deeper dive into RDS IAM Authentication. Aurora is now explored in depth, including newer deployment options such as Aurora Serverless & Global Databases. In ElastiCache, we do a deeper dive on Redis vs Memcached.

Amazon S3 Advanced
An introduction to the new Glacier Deep Archive storage tier and an in-depth look at how S3 lifecycle rules work. A deeper look at S3 performance optimizations, S3 Select & Glacier Select, and S3 Lock Policies & Glacier Vault Policies.

CloudFront & AWS Global Accelerator
A better introduction to how CloudFront works, the integration of CloudFront with different backends (not just S3), the security implications, and advanced features. Finally, an introduction to an awesome new AWS service called Global Accelerator, which will surely take a bigger place in the exam over the years.

AWS Storage Extras
Here, we discuss Storage Gateway, Snowball, Amazon FSx for Windows and for Lustre, as well as a comparison of all the possible storage options on AWS (because there are many!).

Decoupling applications: SQS, SNS, Kinesis, Active MQ
A deeper dive into how to use an ASG with SQS, and a better outline of the differences between Kinesis Data Streams and Firehose.

Serverless Overviews from a Solution Architect Perspective
I added a CRON serverless architecture and updated the Lambda timeout limit to 15 minutes in the slides. I talk about DynamoDB Global Tables and on-demand capacity. Finally, I introduce SAM to you.

Databases in AWS
A deeper introduction to Redshift, including the backup options and Redshift Spectrum.

AWS Monitoring & Audit: CloudWatch & CloudTrail
An overview of AWS Config has been added.

Identity and Access Management (IAM) — Advanced
This section has been remodeled and includes clearer lectures on Security Token Service (STS), Identity Federation, Cognito, AWS Directory Services, AWS Organizations, Advanced IAM concepts, AWS Resource Access Manager (RAM), and AWS Single Sign-On (SSO).

AWS Security & Encryption: KMS, SSM Parameter Store, CloudHSM, Shield, WAF
More security services have been included such as CloudHSM, Shield & WAF.

Networking — VPC
Some more updates for NAT Gateways, VPC Peering, Direct Connect, PrivateLink, ClassicLink, VPN CloudHub.

Disaster Recovery & Migrations
Learning how to use the Database Migration Service (DMS), on-premise strategies with AWS, AWS DataSync, and how to transfer large datasets into AWS.

More Solution Architectures
Event Processing in AWS, Caching Strategies in AWS, Blocking an IP Address in AWS, High-Performance Computing (HPC) on AWS.

Other Services
CloudFormation Stack sets and ECS Security.

As you can see, there’s plenty of content and lots of new things to learn!

You can get the course here:

Happy learning!

FREQUENTLY ASKED QUESTIONS (FAQ)

If I take the SAA-C01 exam before March 22nd, will my certification expire?

No, if you take the exam before March 22nd, your certification will be valid for 3 years and will be the exact same certification as the one earned by people passing SAA-C02. Therefore, I encourage you, if you can, to pass the SAA-C01 certification.

Is the SAA-C02 certification harder? Will this course prepare me well?

In my opinion, the SAA-C02 has increased in difficulty and covers a wider range of AWS services. This is why I have refreshed 4 hours of content and added an additional 4 hours of new content. This course will, as usual, prepare you optimally for the exam. In case you pass the exam and find that some things were missing from this course, please reach out to me.

I am preparing for the SAA-C01. Which lectures should I skip?

You can skip any lecture that starts with “SAA-C02” in the lecture title. Please note that after March 22nd, the SAA-C02 will be the only exam version available, and therefore I will change back the lecture names to their normal form without SAA-C02.

I am preparing for the SAA-C02. Which lectures should I skip?

None. All the lectures in this course are relevant for SAA-C02.

What are the new topics for SAA-C02?

You will find a list below; the relevant lectures have “SAA-C02” at the beginning of their title.

In the new exam guide, there are only 4 sections for SAA-C02, whereas before there were 5. Shouldn’t you remove content?

There is indeed a section that has been removed from the SAA-C02 curriculum, but that doesn’t affect this course. This course has never been constructed around those sections; instead, it is service-centric. As such, there is no need to remove any lectures.

I already passed my SAA-C01 exam. Should I pass SAA-C02?

While I don’t think you need to retake the exam, unless you really enjoy doing so, I strongly suggest you have a look at the new and updated lectures. There is a lot of cool new content in AWS, and it’s never a bad idea to keep yourself updated.

Are you updating the quizzes? What about practice exam questions?

I am updating the quizzes and the practice exam questions this week.

Why not make a separate course?

My goal is to make sure all my past and future students have access to the knowledge they need for their certification. I see this huge update as a gift to all of you, so I would really appreciate it if you took the time to leave a review for the course and recommend it to your colleagues and friends.

How did you know what to include in the new course revision?

In November 2019, I attended the beta exam of the SAA-C02. All the topics I can recollect are now taught in this course.

Should I redownload the slides and code?

Yes, please re-download the slides and code content from Section 2 — Code and Slides Download.

I would love to meet my students

I’m starting a project to feature your stories

It’s pretty crazy — I started on Udemy almost 3 years ago, and students in over 170 countries have enrolled in my online courses. This is a mind-boggling number and a worldwide impact, and I’m so grateful 🙏.

But I’m not interested in just the numbers. While it’s awesome and has truly changed my life, I want to know how I’ve made a difference in some of my students’ lives, in small or big ways.

So I’ve decided that I want to have the opportunity to meet some of my students when I travel around the world.

I want to hear your stories, I want to meet you in person, bring a bit of life to something that is purely online…

I want to get to know you!

So, here’s the thing. I’m starting a public Instagram where I am going to post where I am in the world pretty regularly.

If you happen to be or live where I’m going, please reach out!

Let’s meet!

And I also want to post your stories on this Instagram: a few paragraphs about you, shared to make an impact and encourage others. Here’s a story:

I once met a student of mine at the Kafka Summit who told me he wanted to take a picture to send to his wife. Why his wife, I asked? He said that she took my Kafka courses while being pregnant and taking care of their newborn, and afterward was able to find a Kafka-related job. This almost brought tears to my eyes.

I’m excited about this. I really think this could be a great project! So, if you have a story, please reach out at stories@datacumulus.com with anything you want. We’ll take it from there 🙂

I wish you all a great holiday season!
Hope to meet you soon

The economic viability of the open-source model for Apache Kafka and Confluent

A short case study of the tough act of balancing between OSS and proprietary

Disclaimer: This is my own analysis and is based on my interpretation of public information. I’m trying to remain as impartial as possible and intend for this blog to be a case study of the OSS revenue model. If any corrections need to be made, please let me know in the comments or on Twitter.

Apache Kafka has been open-source for over 8 years and will remain open-source forever. While open-source software (OSS) is free for anyone to use and resell, private corporations spend huge amounts of money (the cost being their employees’ salaries) to maintain and improve open-source projects. Therefore, these corporations must find a viable and healthy economic model around these open-source projects. This has been a topic of focus in the press recently.

Here’s my analysis of the strategy I think Confluent is following, and some recent developments. In the last part, I discuss the tough balance of choosing what is open-source and what isn’t.

Business Model 1: Sell support licenses for the open-source software.

This is something Confluent has been doing for a while. While Apache Kafka is free to use and set up, when deployed within a corporation it costs a lot to maintain, as you usually need full-time employees to harden and monitor the infrastructure. On top of this, employees usually need to be well trained and will learn on the job when encountering bugs or faults. As such, it makes sense to rely on Confluent for support, because Confluent makes most of the contributions to the Apache Kafka project and has the most expertise.

How viable is this model? It’s definitely possible for other companies to sell support, and this is something Cloudera (Hortonworks) and others are doing. Nonetheless, this licensing model works and applies to any Kafka setup: on-premise, in any cloud, on VM, on Kubernetes, or a mix.

Business Model 2: Software as a Service

Deploying software yourself, maintaining the infrastructure, and paying for support may not make economic sense for every company. As such, corporations behind open-source projects offer OSS as a service (MongoDB Inc., Elastic, etc.). In this “all-in-one package”, you get the software, an “elastic” consumption model, and usually some sort of tiered support based on your needs. Confluent Cloud is Confluent’s attempt at this, and according to statements made during the Kafka Summit London 2019, business is going well.

How viable is this model? Selling open source software as a service is something ANY company can do, thanks to the permissive Apache 2.0 license terms. This is why if you need Kafka-as-a-Service, we have Aiven, CloudKarafka, Instaclustr, Confluent, and other players.

One very notable player is Amazon Web Services (AWS). AWS has a long track record of piggy-backing on OSS projects and selling them on its platform, usually integrated with its VPC offering and possibly with IAM for security. AWS has done so with Redis, ElasticSearch, PostgreSQL, and MySQL. The communities and corporations backing these projects usually call for outrage, as AWS has not had a good track record of contributing back to these projects. AWS now seems to be attempting to play nicer, but time will tell. AWS also has the means to build its own proprietary software mimicking any API, such as Aurora for PostgreSQL (meant to compete with Oracle) or DocumentDB to compete with MongoDB.

Now, back to Confluent vs AWS. AWS announced MSK as its own Kafka-as-a-Service offering a year back. This makes it easy for any of its customers to obtain a fully working Apache Kafka cluster in minutes. As a reaction, Confluent made some of its satellite open-source projects “source-available” to protect its IP and prevent AWS from using those directly (although that wouldn’t prevent AWS from re-implementing them with a compatible API, as mentioned). This change of license from open-source to “source-available” is what I discuss next.

Business Model 3: Freemium and weird licenses

Confluent has built many satellite projects around Kafka. They started out open-source (REST Proxy, Schema Registry, KSQL), but most of them have now moved to a “source-available” license. This kind of license is specific to Confluent as far as I know, although other companies (Elastic, MongoDB) have their own flavors of it; lawyers are hard at work. In Confluent’s case, it means in a nutshell that the code is available for everyone to see and use, but not to resell as a service unless a special license is obtained. I don’t think Confluent has issued such a license or ever will. This blocks cloud providers such as AWS from offering a managed Confluent Schema Registry or KSQL without doing a full re-implementation. Other companies such as Aiven have circumvented this by re-implementing their own Schema Registry, for example. As a customer of AWS though, you are free to deploy your own Schema Registry or KSQL setup on top of Amazon MSK, because you would own and manage the deployment.

While this won’t affect most companies, if you’re using the Confluent “source-available” products, this leaves you with “do-it-yourself anywhere” or Confluent Cloud as the only migration options, which is in some ways a light form of lock-in.

Business Model 4: Proprietary software

This brings us to what I’d like to discuss in depth. Creating your own range of proprietary closed-source software is one of the best ways to thrive and survive as a company in the long run. Elastic has done that with their security plugins, and Confluent is doing the same with two products: their security offering and their global Kafka offering.

And here comes the fine game to play. As a VC-backed private corporation, what do you choose to commit to the OSS project versus keep in-house? These decisions have a huge impact on reputation and profitability. Do too much in-house, and the OSS community feels left out. Do too much in the open, and competitors can make the business models described above work.

I’d like to discuss global Kafka. It is enabled by a proprietary broker that is a “fork” of Apache Kafka, named “Confluent Server”. Using Confluent Server, one can have “observers”: replicas that are not counted towards the ISR and acks. These replicas can be deployed in other regions and serve data to local consumers there.

How does it work?

  1. This would not be possible without KIP-392 (allowing consumers to fetch from the closest replica). This KIP is in the Apache Kafka project and is therefore open-source. It allows for better performance because consumers can fetch locally, and it usually saves a ton of cost if, for example, you’re using multiple availability zones on AWS. It makes tons of sense, is a true improvement to the project, and also enabled Confluent to offer global Kafka (see the config sketch after this list). A fine line, which one can see as a win-win for OSS and Confluent.
  2. The “observer” replicas, as far as I know, are proprietary to Confluent Server but work with vanilla Apache Kafka clients.
  3. To improve the stability, scalability, and ease of operations of a Kafka cluster, it’d be great to replace the ZooKeeper cluster with a Kafka-managed quorum. This is the goal of KIP-500. It’s a huge piece of work that I’d love to discuss on its own in another blog, and it will probably take well over a year to be fully merged. It’s not new, as it’s been discussed for over 4 years within the OSS community, and it’s finally happening. I think overall it’ll be an improvement, as it’ll enable Kafka to run more easily on Kubernetes, enable better security models, improve partition scalability, and so on. It also enables the global Confluent Server cluster to be more stable. A fine line again, a win-win for OSS and Confluent.
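
To make item 1 concrete, here is a minimal sketch of how KIP-392’s rack-aware fetching gets wired up, assuming a Kafka version that ships it (broker address and rack names are hypothetical):

# Broker side (server.properties): advertise the broker's location and opt in
broker.rack=us-east-1a
replica.selector.class=org.apache.kafka.common.replica.RackAwareReplicaSelector

// Consumer side (Java): declare the client's location so the broker can serve
// fetches from a replica in the same zone instead of always from the leader
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

Properties props = new Properties();
props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092"); // hypothetical address
props.put(ConsumerConfig.GROUP_ID_CONFIG, "my-group");
props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
props.put(ConsumerConfig.CLIENT_RACK_CONFIG, "us-east-1a");
KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);

Confluent’s proprietary observers ride on the same fetch path, while additionally not counting towards the ISR and acks.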

It’s clear these contributions improve the Apache Kafka project in meaningful ways. It’s also apparent that they enable Confluent to further differentiate its enterprise offering with proprietary features that probably won’t be rolled back into the OSS project. This isn’t new: the Confluent Replicator was another proprietary piece of software, meant to replace the shaky OSS MirrorMaker. Since then, Cloudera has submitted a KIP to create MirrorMaker 2 in ways that are probably similar to how Confluent Replicator works.

Now it’s clear Confluent is at least on(c)e step ahead.

Take-away

Building a business around OSS is not easy, and it’s something companies struggle to do. I think Confluent building enterprise features in-house is a great model that will surely seduce large corporations. For now, all the contributions to the OSS project have been useful to Apache Kafka, so they’re healthy. Still, one should be aware that more than 75% of the contributions to Apache Kafka are made by Confluent employees, and as such, I hope they’ll keep their impartiality in the coming years. This would include letting other companies submit KIPs for features that compete directly with what Confluent offers in-house, including global Kafka. Calling for volunteers; time will tell!

How can I learn Confluent Kafka?

A simple explanation of what Confluent Kafka is and isn’t.

A very frequent and recurring question I get is:

How can I learn Confluent Kafka?

Let’s get right to it!

Apache Kafka and Confluent Kafka

Apache Kafka is the project hosted by the Apache Software Foundation at kafka.apache.org. It’s fully open-source and maintained by the community. About 75% of the commits on the Apache Kafka project come from the private company Confluent; the rest are made by Hortonworks, IBM, and other companies or independent contributors.

And… Confluent Kafka is… exactly the same as Apache Kafka!

Let me make this super clear: it’s exactly the same software.

The Confluent Platform

To be entirely accurate, Confluent adds a few custom metrics-reporter classes to its Apache Kafka bundle, which help it run tools such as Confluent Control Centre if you enable them in your Apache Kafka deployment.

As such, from a learning perspective, nothing changes. Calling it “Confluent Kafka” is a common mistake. It’s just Apache Kafka.

So what’s Confluent Platform then?

Glad you asked! This is a better question.

Confluent adds a bunch of “source available” (for the most part) software to a Kafka deployment in order to add capabilities. This includes Schema Registry, the Avro serializers, KSQL, REST Proxy, etc.

Does this software only work if you run the Kafka version distributed as part of the Confluent Platform? Absolutely not: it works equally well on open-source Apache Kafka!
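
To make this concrete, here is a minimal producer sketch (assuming a broker on localhost:9092 and a Schema Registry on localhost:8081, both hypothetical) using Confluent’s Avro serializer against any Kafka cluster, vanilla or not:

import java.util.Properties;
import org.apache.avro.generic.GenericRecord;
import org.apache.kafka.clients.producer.KafkaProducer;

Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9092"); // any Kafka broker, vanilla Apache Kafka included
props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
// Confluent's serializer, pointed at a Schema Registry instance
props.put("value.serializer", "io.confluent.kafka.serializers.KafkaAvroSerializer");
props.put("schema.registry.url", "http://localhost:8081");
KafkaProducer<String, GenericRecord> producer = new KafkaProducer<>(props);

Only the serializer and the registry URL are Confluent-specific; the broker itself can be plain Apache Kafka.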

How to learn Apache Kafka and Confluent?

  • I have an Apache Kafka for Beginners course I think you’ll love, in which you’ll learn all you need to know to get started with Kafka (and hence a great first step before dealing with the Confluent Platform).
  • I have courses dedicated to learning the Confluent Platform components: one covering the Confluent Schema Registry (where you’ll learn about the Avro serializers) and one covering KSQL (Confluent’s wrapper around the Kafka Streams API).
  • I also have other courses that are not directly related to Confluent but are still very much applicable to Apache Kafka.

Happy learning!

How I used AWS Lambda to hack my way into eating at Septime

AWS Lambda applied to the real world: feeding me.

Delicious dessert

Currently living in Paris with my girlfriend, and both of us being big foodies, Septime was a restaurant we always wanted to go to. It’s extremely popular, as it’s highly ranked among the world’s best restaurants, and my girlfriend described to me the near-impossible experience of getting a booking there (you need to have connections and all).

Wanting to verify this claim, I headed to the restaurant’s booking platform and couldn’t book. Sure, I could subscribe to the wait list, but do I really want to? No.

But one should never say something is “impossible” to a programmer. I was going to hack my way into Septime, doing what I do best: programming.

How to book Septime programmatically

An attentive eye would have noticed that the booking platform is not hosted on the restaurant’s own website, but on a separate booking domain.

Upon using the Chrome Web Developer Tools to analyze the network calls being made between my browser and the booking service, I stumbled upon an easy to use and completely unprotected REST API:

Omnomnom

After a bit of reverse engineering, I found out that two URLs are used in the booking system:

  • one endpoint to check all the available dates
  • one endpoint to check the times for a specific date

Bottom line, finding an available booking time is as easy as checking a few REST API endpoints every minute or so… But I don’t want to keep my computer turned on 24/7 to run that as a CRON job.

We live in 2019, I know better and I’ll use AWS Lambda!

AWS Lambda and the Serverless Framework

AWS Lambda is super trendy and allows you to build APIs, automations, and a bunch of cool stuff — including CRON jobs running in the cloud — in a serverless fashion. On top of that, I can run my Lambda function often enough while remaining in the free tier.

The architecture is super simple: use CloudWatch Events to trigger the Lambda function every minute, and write code that’s serverless friendly. Let’s start with the code.

Simple architecture

The code

You can find the source code here:

The idea is simple: every time the AWS Lambda function is triggered, check all the REST API endpoints for every date (about 16 API calls) and analyze the output payload to see if any time is available. If so, send myself a Pushbullet notification and call it a day. But a few challenges await you if you’re running in AWS Lambda.

Function Timeout

Doing 16 REST API calls will most likely make you go over the AWS Lambda timeout (default 3 seconds). Thankfully, we can increase it to up to 15 minutes, but in our case we’ll set this to 20 seconds using the Serverless framework:

functions:
  septime:
    handler: handler.septime
    timeout: 20
    memorySize: 128
    events:
      - schedule: rate(2 minutes)

Passing secrets

The Pushbullet API key is what could be considered a “secret”. So let’s do something somewhat secure: store it in the SSM Parameter Store and retrieve it when we deploy with the Serverless Framework:

functions:
  septime:
    handler: handler.septime
    timeout: 20
    memorySize: 128
    events:
      - schedule: rate(2 minutes)
    environment:
      PUSHBULLET_API_KEY: ${ssm:/septime/pushbullet_api_key~true}

Alternatively, one could encrypt it using KMS and decrypt it at runtime in the Lambda function, or retrieve it directly from the SSM Parameter Store at runtime (make sure your IAM permissions allow it in that case).

Handling State

When a booking time is available, I want to make sure my phone doesn’t get notified every 2 minutes. I just want one notification for every change in the booking state. Using the fact that my AWS Lambda function’s container is kept warm because I invoke it often enough, I can store the state as a global variable:

import json
import os

# contains a cache of the previous results
previous_result = {}

def septime(event, context):
    global previous_result
    result = do()  # do() queries the booking API endpoints (defined elsewhere in the repo)
    # only notify when the results have changed
    if result != previous_result:
        previous_result = result
        if result["resas"]:
            print("Sending notification to pushbullet")
            send_to_pushbullet(json.dumps(result), os.environ.get("PUSHBULLET_API_KEY"))
        else:
            print("no available resa")

Alternatively, I could have used a DynamoDB table with on-demand capacity enabled, to recover the state properly in case it is lost (on-demand for DynamoDB is a cool new feature from AWS re:Invent 2018).

Packaging

As I’m relying on a requirements.txt file for my Python dependencies, it’s great to use the serverless-python-requirements plugin to package my function properly for me:

plugins:
  - serverless-python-requirements

Deploying

Easy as 1..2..3 with the Serverless Framework:

sls deploy -v

Monitoring

It’s nice to know your AWS Lambda function is working properly using the CloudWatch dashboard for Lambda:

Great to see my function execution speed versus the timeout

The Results

Well, the results were quite surprising… my Lambda function was working properly, but it only ever returned results for lunch times (and weekdays, as the restaurant is closed on the weekend):

After running the program for over a month, I realized that Septime was being sneaky: you just can’t book dinner tables using their online reservation system.

So, I had to revert to the last hack I had in mind… call the restaurant like a normal human being… which worked on the first attempt. No AWS Lambda and Serverless needed; I got my booking and ate there on a Friday night. Oh well, I had some fun with AWS Lambda…

I really look forward to using Google Duplex.

Closing Comments

Although my Lambda wasn’t the one making the booking for me, it provided me with some insights into the booking patterns, and I had a ton of fun figuring out I actually had to call to make a dinner booking (it’s not indicated on the website… sneaky sneaky).

If you’re inspired and you want to learn how to use AWS Lambda for your own projects (small or huge scale), well, I teach about it:

  • Get started in no time and write your first AWS Lambda functions.
  • A 17-hour course to understand how to best leverage the cloud as a developer. It’s awesome.

Happy learning and eating!

How to prepare for the Confluent Certified Developer for Apache Kafka (CCDAK) exam

Getting the Apache Kafka certification from Confluent is a great way to have your skills recognized by your current and future employers. But how do you prepare for that certification?

Certification

I personally find the CCDAK certification to be of medium difficulty, and if you have seriously used Apache Kafka in the past year, many questions shouldn’t be surprising. Nonetheless, it is best to be as prepared as you can, as the 150 USD certification fee is not something you want to pay twice.

I was one of the beta testers :)

I have created a practice exam course with 3 practice tests of 50 questions each, so 150 sample questions. Completing those is a great first step to assess where you stand today. You will find five sample questions in this blog, read on!

Please also read the CCDAK certification FAQ.

Important points are:

  • You can currently take the certification straight from your computer; you don’t need to go to an exam center
  • The certification exam is 55 multiple choice questions in 90 minutes
  • The certification is valid for two years
  • Read the full exam details on the Confluent website

Studying for the Certification

If you find yourself lacking the knowledge, you may already know I have plenty of online content to help you learn Kafka at your own pace, with over 35 hours of videos. Countless students of mine have successfully passed the CCDAK certification exam after studying with my online courses.

Let’s deep dive into what you need to know for each exam topic:

Apache Kafka Fundamentals

You need to know everything about brokers, topics, partitions, offsets, producers and message keys, consumers and consumer groups, delivery semantics, Zookeeper, Java programming, and more. My Apache Kafka for Beginners course will help you get there.

Kafka Connect

You’ll need to know the basics of Kafka Connect, including Workers, Source Connectors and Sink Connectors, as well as converters.

Kafka Streams

You will need to know how Kafka Streams works, how to read high-level DSL topologies, and the difference between stateless and stateful operations.

Confluent Schema Registry / REST Proxy / KSQL

You will need to know the basics of the Confluent components (it is, after all, a Confluent certification), especially the Confluent Schema Registry and how Avro works. For KSQL, reading a few recipes ahead of time may help; you’ll quickly get the idea, it looks like SQL. To learn more about KSQL, check out my KSQL course.

Kafka Setup

Surprising as it seems, you will get questions about Kafka heap sizing and setup, so you will need to acquire knowledge there too.

Kafka Security and Kafka Monitoring

For now, these aren’t covered in depth in the exam. My blog post introducing Apache Kafka Security will do.

Sample questions

The following questions are an extract of my practice exam course (responses at the bottom of the blog):

Q1: You want to perform table lookups against a KTable every time a new record is received from a KStream. What is the output of a KStream-KTable join?

  • A: KStream
  • B: KTable
  • C: GlobalKTable
  • D: You choose between KStream or KTable

Q2: In Kafka, every broker… (select three)

  • A: contains all the topics and all the partitions
  • B: contains only a subset of the topics and the partitions
  • C: knows all the metadata for all topics and partitions
  • D: knows the metadata for the topics and partitions it has on its disk
  • E: is a bootstrap broker
  • F: is a controller

Q3: If a topic has a replication factor of 3…

  • A: Each partition will live on 3 different brokers
  • B: Each partition will live on 2 different brokers
  • C: Each partition will live on 4 different brokers
  • D: 3 replicas of the same data will live on 1 broker

Q4: Your manager would like to favor topic availability over consistency. Which setting do you need to change to enable that?

  • A: min.insync.replicas
  • B: compression.type
  • C: unclean.leader.election.enable

Q5: Using the Confluent Schema Registry, where are Avro schemas stored?

  • A: In the Schema Registry embedded SQL database
  • B: In the _schemas topic
  • C: In the message bytes themselves
  • D: In the Zookeeper node /schemas

Closing Comments

Don’t rush your certification, especially if you’re just getting started with Apache Kafka, as I find the exam really great at testing your actual experience using Apache Kafka. Take your time to learn and practice, and answering the questions will seem natural.

If I have helped you get the certification, leave a comment here or reach out to me on Twitter!

Happy learning, and I wish you good luck for the CCDAK certification!

Quiz answers: 1A, 2BCE, 3A, 4C, 5B.
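
To illustrate answer 1A, here is a minimal Kafka Streams sketch (topic names are hypothetical): joining a KStream with a KTable emits a new KStream, with one output record per stream record, enriched with the latest table value for its key.

import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KTable;

StreamsBuilder builder = new StreamsBuilder();
KStream<String, String> clicks = builder.stream("user-clicks");    // hypothetical topic
KTable<String, String> profiles = builder.table("user-profiles"); // hypothetical topic

// the ValueJoiner combines each stream record with the matching table row
KStream<String, String> enriched = clicks.join(
    profiles,
    (click, profile) -> click + " by " + profile
);
enriched.to("enriched-clicks");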

New features of Kafka 2.1

The text version for those who like to read

A short while ago, I made a video on the new features in Kafka 2.1, which you can watch below, but I wrote this blog so you can also read it in your own time:

Kafka Upgrade Notes

Kafka 2.1 is quite a special upgrade because you cannot downgrade afterwards, due to a schema change in the consumer offsets topic. Otherwise, the procedure to upgrade Kafka is still the same as before, see:

Kafka Internals

Java 11

Kafka 2.1 is now available with Java 11! Java 11 was released in September 2018, and we get all its benefits, such as improved SSL and TLS performance (improvements that came with Java 9). According to one of the main Kafka committers, secure Kafka runs 2.5 times faster than on Java 8.

Apache Kafka on Twitter

Secure Kafka running with SSL/TLS is 2.5x faster with Java 9 http://t.co/YWvaAOIWCB

We also get garbage collector improvements: G1 is now the default collector and can run in parallel. That means you’re less likely to have long GC pauses.

Other Java 11 goodies include:

  • var allows you to reduce the amount of boilerplate code in your JDK applications. It is easier to read, as you can see in this example where we create a new producer:
var producer = new KafkaProducer<String, String>(properties);
  • You also get improvements to the Future APIs, if you use Future.
  • Project Jigsaw is now supported by Kafka. It allows you to have smaller compiled program binaries by modularizing Java, and more efficient runtimes.

Fencing of Zombie Replicas (KIP-320):

This new KIP fixes a rare data-loss condition, which is now completely gone.
You also get a new exception that the consumer can receive when calling .poll(): OffsetOutOfRangeException.

If you want to know how to deal with it, just read the KIP details; it is well explained:

Kafka Clients

In this new version of Kafka, the biggest change by far is the Support for ZStandard Compression (KIP-110).

Support for Zstandard Compression (KIP-110)

This algorithm was created by Facebook in September 2016 and has been in the works for 2 years. The compression ratio is as good as gzip on the first pass. But if we look at the compression and decompression speed, meaning how many megabytes can be compressed or decompressed per second, we get 5 times the performance of gzip.

In Kafka, according to the KIP, the compression works great, as we get a 4.28 compression ratio. Shopify is an example of a production environment using Zstandard, and they noted a massive decrease in CPU usage. Overall, if you use Zstandard, you should get more throughput for a fraction of the cost.

Here is how to use ZStandard compression:

  • Producers (≥2.1): set compression.type=zstd
  • Consumers (<2.1): will fail with an UNSUPPORTED_COMPRESSION_TYPE error
  • Consumers (≥2.1): it works out of the box

Also, to use zstd compression, you have to update the brokers to 2.1. To summarize, if you want to use zstd, first update your consumers, then your brokers, and finally your producers (a config sketch follows).
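
Here is a minimal producer sketch (broker address is hypothetical) enabling zstd once brokers and consumers are on 2.1:

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.common.serialization.StringSerializer;

Properties props = new Properties();
props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // hypothetical broker
props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
// requires brokers >= 2.1; consumers on < 2.1 will fail with UNSUPPORTED_COMPRESSION_TYPE
props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "zstd");
var producer = new KafkaProducer<String, String>(props);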

Link:

Avoid Expiring Consumer Offsets for Active Consumer Groups (KIP-211)

In previous versions of Kafka, there was a bug: a consumer that was still active but not receiving data (for example because of a dormant producer or a down pipeline) would lose its committed offsets after some time. If there was a restart or a rebalance, offsets would be reset, resulting in either data loss or duplicate reads (based on the auto.offset.reset setting). This has been fixed in Kafka 2.1: an active consumer will no longer have its consumer offsets expired. It is one of the reasons why we cannot downgrade to a previous version after upgrading our brokers to 2.1.

Link:

Intuitive Producer Timeout (KIP-91)

In Kafka, there was never an easy setting to provide a timeout for an entire send request. Instead, we had to use four settings: max.block.ms, linger.ms, retry.backoff.ms, and request.timeout.ms, all of them linked to one another.

This has now been fixed by wrapping all these settings in a new setting called delivery.timeout.ms, which provides an upper bound on how long you will wait until a message is delivered.

Here is how to use delivery.timeout.ms: by default, it is quite a big value, 120000 ms (120 seconds, 2 minutes). However, if you set it to Integer.MAX_VALUE, you can delegate the retry mechanism to Kafka (wait indefinitely), which is quite nice.

Also, this KIP brought a big change to retries: it now defaults to Integer.MAX_VALUE instead of 0. This means you can get out-of-order data if sends are retried, unless you do one of the following (a config sketch follows the list):

  1. You set max.in.flight.requests.per.connection=1, and that’s perfect because even with retries there is no reordering, but you lose throughput.
  2. You set enable.idempotence=true (which allows max.in.flight.requests.per.connection=5) and don’t get any out-of-order data, while making sure you get maximum throughput.
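
As a minimal sketch (broker address is hypothetical), option 2 combined with the new delivery timeout looks like this:

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.common.serialization.StringSerializer;

Properties props = new Properties();
props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // hypothetical broker
props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
// one upper bound for the whole send: batching, awaiting acks, and retries
props.put(ProducerConfig.DELIVERY_TIMEOUT_MS_CONFIG, 120000);
// idempotence preserves ordering even with retries and 5 in-flight requests
props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, true);
var producer = new KafkaProducer<String, String>(props);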

Link:

Kafka Streams

Kafka Streams had lots of small changes

  • KIP-358: Kafka Streams has migrated to using Duration instead of longMs times. Thus, you might get a lot of deprecation warnings when you upgrade your Kafka Streams application to 2.1 (see the sketch after this list).
  • KIP-353: Kafka Streams used to have complicated logic to select the “next record to process”, especially when there were multiple input partitions to read from. The algorithm used to be quite hard to reason about and non-deterministic in old Kafka Streams versions. There is now a new setting, max.task.idle.ms, which allows you to control how long you are willing to wait for the synchronization of timestamps across different partitions. It defaults to 0 ms to favor latency. If you want to test that setting and are willing to introduce a little bit of latency for the benefit of better timestamp synchronization between partitions, set it to something a little bit higher.
  • KIP-319: Internal fix when using WindowByteStore .
  • KIP-321: It was already possible to extract the TopologyDescription at runtime (KIP-120), but this has been improved in KIP-321.
  • KIP-330: This new version adds a missing interface to get the retention period of a SessionBytesStoreSupplier .
  • KIP-356: Kafka 2.1 adds a withCachingDisabled() function to StoreBuilder to complement the existing withCachingEnabled() function. Nevertheless, caching is still disabled by default.
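
As a small sketch of the KIP-358 migration mentioned above (the window size is an arbitrary example):

import java.time.Duration;
import org.apache.kafka.streams.kstream.TimeWindows;

// before 2.1 (now deprecated): raw long milliseconds
// TimeWindows.of(300_000L);
// since 2.1 (KIP-358): java.time.Duration makes the unit explicit
TimeWindows windows = TimeWindows.of(Duration.ofMinutes(5));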

Kafka Security

In my mind, Kafka 2.1 brings lots of really amazing changes regarding security.

DNS — KIP-235: You can now resolve CNAMEs before authentication using the new client.dns.lookup setting. This allows you to be a bit smarter about how you use DNS with Kafka while keeping the security guarantees.

DNS — KIP-302: Now, if a DNS record returns multiple IPs, clients have a new setting to resolve and try all of the IPs instead of just the first one: client.dns.lookup=”use_all_dns_ips”. This new feature allows us to create complex DNS records with round robin and give only those to our clients, instead of a longer bootstrap.servers list.

ACL — KIP-231: The ListGroups API now relies on the Describe(Group) ACL instead of the Describe(Cluster) ACL.

Kafka Administration

The admin improvements in Kafka 2.1 are mostly CLI (command-line interface) improvements. Here is a list:

KIP-308: There is a new CLI named kafka-get-offsets.sh to get offsets for multiple topics at a time, including partitions. (Edit: this actually did not make it into 2.1.0; it should land in 2.2.0.)

KIP-338: You now have the ability to exclude internal topics in kafka-topics.sh using the --exclude-internal option, which was not achievable before unless you parsed the output with some complex regex.

KIP-340: The kafka-reassign-partitions and kafka-log-dirs CLIs now accept a properties file as input. The producer and the consumer already had that logic, but it is now implemented for other CLIs as well.

KIP-324: It is now possible to get AdminClient metrics using a .metrics() method, the same way we already can for producers and consumers.
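
A minimal sketch (broker address is hypothetical) of reading those metrics:

import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;

Properties conf = new Properties();
conf.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // hypothetical broker
try (AdminClient admin = AdminClient.create(conf)) {
    // same Map<MetricName, ? extends Metric> shape as producer/consumer metrics()
    admin.metrics().forEach((name, metric) ->
        System.out.println(name.name() + " = " + metric.metricValue()));
}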

Closing Comments

Overall, I would say this is a really good upgrade. It has mostly been about stability, but nevertheless, I strongly recommend that you upgrade your Kafka deployment to this version.

If you want to learn more about Kafka 2.1, please go check out my online courses:

Happy learning!

An honest review of AWS managed Apache Kafka (Amazon MSK)

UPDATED ARTICLE

This article was written when Amazon MSK was released in beta form; since then, MSK has improved tremendously.

I have created a course to help you learn it, and I’ve updated this blog as Amazon MSK is now production-ready.

Why you would want to use Amazon MSK

Amazon MSK is one of the best ways to deploy Apache Kafka in your AWS VPC securely and quickly. The main advantages you will get are:

  • Managed service: you don’t have to bring together an entire engineering team to set up Apache Kafka. You can start building your applications in less than 15 minutes.
  • Network security: Apache Kafka on Amazon MSK is deployed within your VPC, meaning that Apache Kafka network packets never go out over the internet. This is a big difference from public managed solutions such as Confluent Cloud.
  • Kafka security: MSK supports SSL-based security and SASL/SCRAM. I’ve set up Kafka security before, and I can tell you it’s error-prone and hard. With MSK, you can use a secure Kafka cluster right away.
  • Cost savings: one HUGE advantage of using Amazon MSK is that you do not pay for Apache Kafka replication traffic going across AZs. If you run Apache Kafka on EC2 machines yourself with a replication factor of 3, the network bill can become pretty significant at high data volumes. There’s a calculator to compute your potential cost savings when using MSK.
  • Managed upgrades: one simple API call to upgrade your Kafka cluster with no downtime.

The rest is a full Apache Kafka experience. You can still customize settings if you’re an advanced user, it is using the standard Apache Kafka distribution and therefore all your Kafka Streams, Kafka Connect, or any Kafka applications will still work.

Amazon did create some nice AWS service integrations with MSK:

  • You can use the Glue ETL service to run a managed Apache Spark job directly connected to your Amazon MSK cluster
  • You can use the Kinesis Data Analytics service to run a managed Apache Flink job directly connected to your Amazon MSK cluster
  • You can use Lambda functions (!) to create Kafka consumers and react to data flowing through your Kafka topics. Very, very neat
  • AWS Certificate Manager is used to provide SSL certificates for your clients.

Conclusion

Amazon MSK is now a very good solution to implement Apache Kafka on AWS. I am recommending it to my clients for its ease of use.

If you want to learn Amazon MSK, I’ve created a course, and as a thank-you for reading this article, you can use the coupon code MEDIUM15 to get a 15% discount at checkout ✌️

And if you want to learn how Apache Kafka works, my other tutorials should help! Happy learning :)
