date 2021-09-18
What’s fancy about context transition in DAX?

Row and filter context are well-known concepts in DAX. But we can switch between these two with context transition. Let’s look at what we can do with it. In this article, I show you what context transition is in DAX and how to use it. I will cover only the...

towardsdatascience.com

What is bias?

The AI bias trouble starts — but doesn’t end — with definition. “Bias” is an overloaded term which means remarkably different things in different contexts. Here are just a few definitions of bias for your perusal. In statistics: Bias is the difference between the expected value of an estimator and...
COMPUTERS
towardsdatascience.com

Thinking, fast and slow: AI edition

Francesca Rossi on paradigms for general intelligence. Editor’s note: The TDS Podcast is hosted by Jeremie Harris, who is the co-founder of SharpestMinds, a data science mentorship startup. Every week, Jeremie chats with researchers and business leaders at the forefront of the field to unpack the most pressing questions around data science, machine learning, and AI.
COMPUTERS
towardsdatascience.com

Introduction to Marketing Mix Modeling in Python

To keep a business running, spending money on advertising is crucial, regardless of whether the company is small or already established. And the amount of ad spending in the industry is enormous. These volumes make it necessary to spend each advertising dollar wisely. However, this...
SOFTWARE
towardsdatascience.com

How to Generate Automated PDF Documents with Python

Leveraging automation to create dazzling PDF documents effortlessly. When was the last time you grappled with a PDF document? You probably don’t have to look too far back to find the answer to that question. We deal with a multitude of documents on a daily basis in our lives and an overwhelmingly large number of those are indeed PDF documents. It is fair to claim that a lot of these documents are tediously repetitive and agonizingly painful to formulate. It is about time we consider leveraging the power of automation with Python to mechanize the tedious so that we may reallocate our precious time to more pressing tasks in our lives.
CODING & PROGRAMMING
towardsdatascience.com

A Reply to “TensorFlow Sad Story”

Today I read “TensorFlow Sad Story” by Zahar Chikishev and, while I agree with most of his points, I felt it was a bit unfair to TensorFlow itself, as it did not try to cover or understand why TensorFlow is as it is or why on Earth Google would do certain things. Being a user of TensorFlow since 2.0 came out in September 2019 and having felt many of the same pains Zahar described in his article, I started drafting a reply to him. However, as you can see from this article, my comment grew out of control, so I decided to turn it into a separate piece.
CODING & PROGRAMMING
towardsdatascience.com

Custom dataset in Pytorch —Part 1. Images

PyTorch has a great ecosystem to load custom datasets for training machine learning models. This is the first part of the two-part series on loading custom datasets in PyTorch. In this walkthrough, we’ll learn how to load a custom image dataset for classification. The code for this walkthrough can also be found on GitHub.
CODING & PROGRAMMING
towardsdatascience.com

6 Linux Commands for Data Scientists

Terminal commands that put your data at your fingertips. The GNU Core Utilities (coreutils) is a package of command-line utilities for file, text, and shell operations. It has more than a hundred commands. In this article, you will find six GNU Coreutils commands that are useful for dealing with text, CSV,...
CODING & PROGRAMMING
towardsdatascience.com

Exploring Datacommons, the API powering statistical queries on Google Search

(Disclaimer: I work as a Product Manager at Datapane.) Ever wondered how Google is able to give such accurate responses to questions like “What is the median income in San Francisco?” The statistical queries on Google Search are powered by DataCommons, an open-source repository of publicly available datasets. DataCommons provides a...
SOFTWARE
towardsdatascience.com

How to build a Machine Learning (ML) based Predictive System

A practical data science guide to developing a prediction model that classifies customers into two satisfaction classes. We all know that customer satisfaction is key to boosting a company’s performance, but organizations still strive to utilize the increasing availability of data to satisfy customers. In this article, I illustrate how machine learning and data science techniques can be employed to assess and evaluate customer satisfaction. I present the necessary steps to develop customer-driven prediction models, from problem framing to exploratory data analysis, data transformation, ML training, and recommendations.
SOFTWARE
towardsdatascience.com

Backpropagation: The Natural Proof

You might scroll down to the third image to jump straight to it. What sets artificial neural networks apart from other machine learning algorithms is how they assume very little about your dataset. Your neural network doesn’t care if your classification data isn’t linearly separable via a kernel or if...
COMPUTERS
towardsdatascience.com

Parallelize your python code to save time on data processing

Annoyed by the long waits while processing your data? This blog is for you! Have you ever faced a situation where you had to wait for a long time while processing your data? Honestly, it happens to me a lot. So to reduce my pain a bit, I make sure to use all the computing resources available to humankind to minimize this wait.
CODING & PROGRAMMING
towardsdatascience.com

Five questions that will help you model integer linear programs better

A structured way to formulate real-world problems, like the Knapsack problem, as mathematical models. You might have heard about classical mathematical problems, such as the Travelling Salesman Problem or the 0/1 Knapsack problem. There are several options to solve such optimization problems, but the most basic one is trying to find the exact solution. For this purpose, most mathematicians apply integer linear programming, ILP for short. When I was introduced to this in a university course, it was very confusing. Usually, the professor would give us an elaborate problem statement, which could be boiled down to an ILP containing fewer than ten lines. The trick is to make the conversion from such a real-life problem to a mathematical model. Together with my classmates, I found it quite challenging to do so. Fortunately, along the way we developed a list of five questions which enabled us to analyse the problem in a structured way. Even better, it made writing down the actual model much easier. In this article I will explain this approach in detail in order to help you with modeling your next ILP. This will be done by applying it immediately to a real-life problem: the 0/1 Knapsack problem. To start, here are the five questions:
MATHEMATICS
towardsdatascience.com
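
To make the 0/1 Knapsack model mentioned in the teaser above concrete, here is a minimal sketch of the standard formulation: maximize the total value of the chosen items subject to a weight capacity, with one binary decision variable per item. It is not the article's five-question method, and it enumerates every assignment instead of calling an ILP solver, so it only suits tiny instances. The function name `solve_knapsack` and the item data are hypothetical.

```python
from itertools import product


def solve_knapsack(values, weights, capacity):
    """Solve max sum(v_i * x_i) s.t. sum(w_i * x_i) <= capacity, x_i in {0, 1}.

    Brute force over all 2^n binary assignments (illustration only).
    """
    best_value = 0
    best_choice = tuple(0 for _ in values)
    for x in product((0, 1), repeat=len(values)):
        # Constraint: the selected items must fit within the capacity.
        total_weight = sum(w * xi for w, xi in zip(weights, x))
        if total_weight > capacity:
            continue
        # Objective: total value of the selected items.
        total_value = sum(v * xi for v, xi in zip(values, x))
        if total_value > best_value:
            best_value, best_choice = total_value, x
    return best_value, best_choice


# Hypothetical instance: three items and a capacity of 50.
values = [60, 100, 120]
weights = [10, 20, 30]
best_value, best_choice = solve_knapsack(values, weights, capacity=50)
print(best_value, best_choice)
```

Each entry of `best_choice` plays the role of a binary variable in the ILP: 1 means the item is packed, 0 means it is left out.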

An Understandable Introduction To C

The C programming language is likely one of the most important programming languages ever created in the history of computing. This language really changed the world of computers for the better in so many ways, and still plays a vital role in the world of computing today. No matter what system you are currently reading this article on, unless you printed this out for some reason, you are using C code as we speak. That being said, it is easy to understand why learning the C language could be beneficial in your domain if you work in any technological field. This of course includes software engineering, as well as Data Science, but those are not the only disciplines that I think should be involved with the C programming language. There is also a video that I created that might be more digestible content for some.
CODING & PROGRAMMING
towardsdatascience.com

Wavelet Transforms in Python with Google JAX

Wavelet transforms are one of the key tools for signal analysis. They are extensively used in science and engineering. Some of the specific applications include data compression, gait analysis, signal/image de-noising, digital communications, etc. This article focuses on a simple lossy data compression application by using the DWT (Discrete Wavelet Transform) support provided in the CR-Sparse library.
CODING & PROGRAMMING
towardsdatascience.com
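
The lossy-compression idea in the wavelet teaser above can be sketched with a single-level Haar transform in plain Python. This is an illustrative stand-in, not the CR-Sparse API or the article's code: the functions `haar_dwt`, `haar_idwt`, and `compress` are hypothetical names, and the Haar wavelet is assumed only because it is the simplest DWT to write out by hand.

```python
import math


def haar_dwt(signal):
    """One level of the Haar DWT: return (approximation, detail) coefficients."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = math.sqrt(2)
    pairs = list(zip(signal[0::2], signal[1::2]))
    approx = [(a + b) / s for a, b in pairs]  # scaled pairwise sums
    detail = [(a - b) / s for a, b in pairs]  # scaled pairwise differences
    return approx, detail


def haar_idwt(approx, detail):
    """Invert one level of the Haar DWT."""
    s = math.sqrt(2)
    signal = []
    for a, d in zip(approx, detail):
        signal.extend([(a + d) / s, (a - d) / s])
    return signal


def compress(signal, threshold):
    """Lossy step: zero detail coefficients smaller than the threshold."""
    approx, detail = haar_dwt(signal)
    kept = [d if abs(d) >= threshold else 0.0 for d in detail]
    return haar_idwt(approx, kept)


signal = [4.0, 6.0, 10.0, 12.0]
print(compress(signal, threshold=2.0))
```

With every detail coefficient zeroed, each pair of samples collapses to its mean, which is the loss the threshold trades for fewer non-zero coefficients to store.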

MLOps without Much Ops

If you do not work for Big Tech (the Googles, Facebooks, and Amazons of this world), chances are that you work for a “reasonable scale” company. Reasonable scale companies aren’t like Google. They can’t hire all the people they dream of and they don’t serve billions of users per day from a cloud infrastructure they own. Reasonable scale companies process millions of data points, not billions; they can hire dozens of data scientists, not hundreds, and they have to optimize for their computing costs.
TECHNOLOGY
towardsdatascience.com

Linear programming with Python and Julia

“True optimization is the revolutionary contribution of modern research to decision processes.” — George Dantzig. I was intrigued by the concept of optimization when I attended the course Operations Research (OR) during my undergraduate studies in Mechanical Engineering half a decade ago. The main reason this course was fascinating to me was that it dealt with solving real-world problems such as optimizing the workflow in a factory, supply chain management, scheduling flights in an airport, travelling salesman problem, etc. Operations Research deals with how to make decisions efficiently through the use of different mathematical techniques or algorithms. In a real-world setting, this could mean maximizing (profit, yield) or minimizing (losses, risks) the given expression while satisfying the constraints such as costs, time, and resource allocation.
CODING & PROGRAMMING
towardsdatascience.com

The Mystery of Feature Scaling is Finally Solved

Principal Researcher: Dave Guggenheim / Co-Researcher: Utsav Vachhani. For some machine learning models, feature scaling is an important step in data preprocessing. Regularized algorithms (e.g., lasso and ridge penalties), distance-based models (e.g., k-nearest neighbors, clustering, support vector machines, etc.), and artificial neural networks all perform better when the predictors are on the same scale or within the same boundaries. But feature scaling can be much more than inducing conformity; it can be a powerful addition to your predictive modeling toolbox.
COMPUTERS
towardsdatascience.com
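
As a rough illustration of what putting predictors "on the same scale or within the same boundaries" can mean, the sketch below implements two common scalers in plain Python: standardization (zero mean, unit standard deviation) and min-max scaling (values mapped to [0, 1]). The teaser does not say which scalers the study compares, so these two are an assumption, and the function names and sample data are hypothetical.

```python
from statistics import mean, pstdev


def standardize(xs):
    """Rescale to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(xs), pstdev(xs)
    if sigma == 0:
        raise ValueError("cannot standardize a constant feature")
    return [(x - mu) / sigma for x in xs]


def min_max_scale(xs):
    """Rescale linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(xs), max(xs)
    if hi == lo:
        raise ValueError("cannot min-max scale a constant feature")
    return [(x - lo) / (hi - lo) for x in xs]


# Hypothetical feature measured in large units.
income = [30_000, 45_000, 60_000]
print(min_max_scale(income))
print(standardize(income))
```

In practice the scaling parameters (mean, standard deviation, minimum, maximum) are usually computed on the training split only and then reused on the test split, so that no information from the test data leaks into preprocessing.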

Destroying Every Programming Concept You Know With Julia

That being said, Julia, for my applications, typically blows the usual view one might take of a paradigm completely out of the water. Before we get into what I am mentioning, let us quickly go over the subject of programming paradigms in case any of my readers are not familiar with the concept.
CODING & PROGRAMMING
towardsdatascience.com

Understanding Machine Learning Models Better with Explainable AI

Building an interactive dashboard in a few lines of code with ExplainerDashboard. It is interesting to decipher the workings of Machine Learning through a web-based dashboard. Imagine gaining access to interactive plots displaying information on model performance, feature importance, and What-if analysis. What is exciting is that one does not need any web development expertise to build such an informative dashboard; just a few simple lines of Python code are sufficient to generate a stunningly interactive Machine Learning Dashboard. This is possible by using a library called ‘Explainer Dashboard’.
CODING & PROGRAMMING
towardsdatascience.com

No, you don’t need a holdout group

I’m a co-founder at Aampe, where we embed contextual learning algorithms into mobile apps’ push notifications to learn about and adapt to individual user preferences. We do a lot of tests and a lot of experimental design, as well as a fair amount of machine learning. This post is about a particular request we often get from potential customers: that we hold out a subset of users who will have neither the content nor the timing of their notifications chosen by Aampe’s learning systems. This request seems to stem from the mistaken belief that a holdout comparison is somehow inherently a “scientific” practice.
COMPUTERS

