Whether you’re a computer science student, a professional data scientist, or just someone exploring what you can build with Python, PyCon is a great place to get information and learn from knowledgeable and skilled speakers. Python is one of the best-known and most widely used programming languages today, with millions of developers and users worldwide.
Because Python is used internationally, many communities have been built to connect developers, help them network, and share experiences. One of the biggest annual Python-related events is PyCon (the Python Conference). PyCon is not just one conference; it is a collection of conferences around the world, the largest of which is held in the United States.
PyCon often includes talks, guides, and workshops on various Python-related topics at all levels, from beginner to expert. So whenever I want to learn something new about Python, or refresh my knowledge, I first search for relevant talks from previous PyCons before going to Google and looking for tutorials.
In my experience, PyCon talks are often short (less than 30 minutes) and concise, and accomplished speakers present them in a fun and simple way. Despite their brevity, these talks often contain all the details you need to gain a solid understanding of the subject.
This year’s (2021) PyCon US is already over, and like many other conferences held this year, it was completely virtual. The conference included plenty of eye-opening talks, easy-to-follow workshops, and helpful tutorials, and all PyCon videos are now available on YouTube. Although I would recommend browsing all of PyCon’s material, in this article I will focus on the data science talks I watched and learned a lot from.
Let’s start with a talk by two software engineers, Randall Hunt and Mike Ruberry, who have experience at big technology companies like Facebook, AWS, and SpaceX. Hunt and Ruberry talked about how NumPy is used within PyTorch, whether PyTorch is really NumPy-compatible, and how you can address compatibility gaps to ensure smoother performance.
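To give a feel for what “NumPy compatibility” means in practice, here is a short sketch, using NumPy only, of two behaviors that array libraries like PyTorch aim to match: dtype promotion and broadcasting. (The specific cases the talk examines are not listed in this article, so these are illustrative assumptions.)

```python
import numpy as np

# Two behaviors a "is this library NumPy-compatible?" check typically probes:

# 1. Type promotion: mixing dtypes should yield NumPy's promoted dtype.
a = np.arange(3, dtype=np.int32)
b = np.ones(3, dtype=np.float64)
promoted = (a + b).dtype  # float64 under NumPy's promotion rules

# 2. Broadcasting: a (3, 1) array plus a (4,) array broadcasts to (3, 4).
x = np.zeros((3, 1))
y = np.arange(4)
broadcast_shape = (x + y).shape

print(promoted)         # float64
print(broadcast_shape)  # (3, 4)
```

A library that claims NumPy compatibility should reproduce both results for the same inputs.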
Anyscale software engineer Simon Mo talks about the hassle of deploying machine learning models to production. As a data scientist, building and training machine learning models should be where you spend most of your time. Mo shows how to deploy your models with less effort and walks through deploying a machine learning model with Ray Serve, a scalable model-serving framework.
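The core pattern behind a model-serving framework like Ray Serve is to wrap a trained model in a deployment class and route incoming requests across replicas of it. Here is a minimal pure-Python sketch of that pattern with a toy model and round-robin dispatch; the real Ray Serve API and its scheduling are different, so treat this only as a conceptual illustration.

```python
import itertools

class ModelDeployment:
    """One replica: wraps a (toy) trained model behind a request handler."""
    def __init__(self, replica_id: int):
        self.replica_id = replica_id

    def __call__(self, features):
        # Stand-in for real inference, e.g. self.model.predict(features).
        return {"replica": self.replica_id, "prediction": sum(features)}

class Router:
    """Round-robin dispatcher over replicas: the role a serving proxy plays."""
    def __init__(self, num_replicas: int):
        self.replicas = [ModelDeployment(i) for i in range(num_replicas)]
        self._cycle = itertools.cycle(self.replicas)

    def handle(self, features):
        return next(self._cycle)(features)

router = Router(num_replicas=2)
results = [router.handle([1.0, 2.0]) for _ in range(4)]
print(results)  # requests alternate between replicas 0 and 1
```

Scaling up then amounts to adding replicas and running them as separate processes or machines, which is what a framework like Ray Serve automates.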
Marina Shvartz is an artificial intelligence software engineer at Aidoc Medical. Shvartz addresses the struggle of testing AI models effectively when we can’t enumerate their edge cases manually. She talks about property-based testing with the Hypothesis library and how it can help data scientists generate edge cases that lead to well-tested, robust AI models.
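The idea behind property-based testing is to generate many random inputs and assert that an invariant holds for all of them, instead of hand-writing a few fixed cases. Hypothesis automates the generation, shrinking, and reporting; the following is only a minimal standard-library sketch of the concept, with a toy function and properties I chose for illustration.

```python
import random

def normalize(scores):
    """Scale a list of non-negative scores so they sum to 1 (toy pipeline step)."""
    total = sum(scores) or 1.0  # guard the all-zeros edge case
    return [s / total for s in scores]

def check_properties(trials=200, seed=0):
    rng = random.Random(seed)
    for _ in range(trials):
        n = rng.randint(0, 10)  # n = 0 exercises the empty-list edge case
        scores = [rng.uniform(0, 100) for _ in range(n)]
        out = normalize(scores)
        # Property 1: output sums to ~1 whenever there is any mass at all.
        if scores and sum(scores) > 0:
            assert abs(sum(out) - 1.0) < 1e-9
        # Property 2: the length is always preserved.
        assert len(out) == len(scores)
    return True

print(check_properties())  # True if the properties hold on all generated inputs
```

With Hypothesis, the random-input loop is replaced by declarative input strategies, and failing inputs are automatically shrunk to a minimal counterexample.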
SangBin Cho, another software engineer from Anyscale, continues Simon Mo’s topic of serving machine learning models. In Cho’s talk, you’ll learn more about how Ray Serve handles the challenge of scaling data science Python applications. He discusses the challenges the team faced in developing Ray and how they overcame them to support handling large datasets.
Apache Kafka is one of the best-known data streaming platforms. Francesco Tisiot, a developer at Aiven, helps you explore what Apache Kafka is capable of and what problems it can solve. Tisiot provides tips for using Kafka from Python libraries, and then introduces Kafka Connect, a tool for integrating Kafka with external systems, to take your application to the next level.
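Kafka’s core abstraction is an append-only log that producers write to and consumers read from at their own offsets. Here is a minimal in-memory sketch of that idea, assuming a single topic with one partition; real Kafka adds partitioning, persistence, and replication, and from Python you would use a client library rather than anything like this.

```python
class TopicLog:
    """Single-partition, in-memory stand-in for a Kafka topic."""
    def __init__(self):
        self.records = []

    def produce(self, value):
        self.records.append(value)    # append-only, like a Kafka partition
        return len(self.records) - 1  # offset of the new record

class Consumer:
    """Each consumer tracks its own offset, so reads never remove records."""
    def __init__(self, topic: TopicLog):
        self.topic = topic
        self.offset = 0

    def poll(self):
        batch = self.topic.records[self.offset:]
        self.offset = len(self.topic.records)
        return batch

topic = TopicLog()
consumer_a = Consumer(topic)
consumer_b = Consumer(topic)

topic.produce({"event": "click", "user": 1})
topic.produce({"event": "purchase", "user": 1})

print(consumer_a.poll())  # both records
print(consumer_b.poll())  # the same two records: consumers read independently
print(consumer_a.poll())  # [] - consumer A is already caught up
```

This independence of consumer offsets is what lets many applications process the same event stream at their own pace.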
We live in an expanding world of information; the amount of data produced and processed by our applications is growing rapidly. Jenna Conn and Hannah Cline, both software engineers, discuss how you can use the Python library Celery to create task queues for your application, keeping data processing organized and improving the overall experience.
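Celery’s model is a task queue: producers enqueue units of work, and workers consume and execute them asynchronously. The following is a minimal standard-library sketch of that pattern using `queue.Queue` and a worker thread; Celery adds message brokers, retries, scheduling, and distributed workers on top of this idea.

```python
import queue
import threading

task_queue = queue.Queue()
results = []

def worker():
    """Consume tasks until a None sentinel arrives, like a worker loop."""
    while True:
        task = task_queue.get()
        if task is None:
            break
        func, args = task
        results.append(func(*args))
        task_queue.task_done()

# Enqueue some data-processing "tasks" (a function plus its arguments).
for n in range(5):
    task_queue.put((lambda x: x * x, (n,)))
task_queue.put(None)  # sentinel: tell the worker to stop

t = threading.Thread(target=worker)
t.start()
t.join()

print(results)  # [0, 1, 4, 9, 16]
```

Because the queue decouples producers from consumers, the application can keep accepting work while slower processing catches up in the background.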
Kevin Kho, Prefect’s open source community engineer and a former data scientist, discusses the data validation process with Spark and Dask, focusing on large-scale data. Kho looks at the challenge of validating different partitions of data and how it can be overcome by combining the right tools. He explains validation with Pandera and how it can be done more efficiently with Spark, Dask, and Fugue.
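Schema-based validation in the Pandera style means declaring, per column, an expected type and a value check, then applying that schema to each partition of the data; Fugue is what lets the same validation logic run on Spark or Dask. Here is a pure-Python sketch of the idea with a hypothetical schema format of my own (Pandera’s real API uses `DataFrameSchema` objects instead).

```python
# Hypothetical schema: column -> (expected type, value predicate).
schema = {
    "age":   (int,   lambda v: 0 <= v <= 120),
    "email": (str,   lambda v: "@" in v),
    "score": (float, lambda v: 0.0 <= v <= 1.0),
}

def validate_partition(rows, schema):
    """Return (valid_rows, errors); run once per Spark/Dask partition."""
    valid, errors = [], []
    for i, row in enumerate(rows):
        problems = [
            col for col, (typ, check) in schema.items()
            if not isinstance(row.get(col), typ) or not check(row[col])
        ]
        if problems:
            errors.append((i, problems))  # row index + failing columns
        else:
            valid.append(row)
    return valid, errors

rows = [
    {"age": 34,  "email": "a@example.com", "score": 0.9},
    {"age": 200, "email": "bad-address",   "score": 0.5},  # two failing checks
]
valid, errors = validate_partition(rows, schema)
print(len(valid), errors)  # 1 [(1, ['age', 'email'])]
```

Running the same per-partition function under Spark or Dask is what makes this approach scale to large datasets.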
PyCon talks, tutorials, and workshops are among the most useful and reliable resources for everything related to Python. PyCon is a series of Python conferences organized annually by volunteer Python communities and held worldwide.
The first and most important PyCon is the US edition, usually held in the first half of the year. PyCon US 2021 was held in May and included hundreds of talks and tutorials focusing on different aspects of Python and targeting people of all experience levels. Even if you are new to the world of programming and Python, I can guarantee that you will find a PyCon talk that matches your level and gives you useful information.
Python is a very versatile language that can be used in many applications, but one of its most common application areas is data science. In this article, I highlighted seven great PyCon US 2021 talks related to data science that are full of useful information for practitioners of all levels.
In addition to the talks suggested in this article, I recommend going through the entire list of PyCon talks; you may find an interesting one that I did not mention here.