
Data Analytics Newsletter #34
The first rule of machine learning, ethics
and identifying snake oil

 

The Data Analytics Practice Committee (DAPC), the Young Data Analytics Working Group (YDAWG) and the Actuaries Institute are pleased to bring the latest in the world of Data Analytics to your inbox, and to share some of our recent work with you.

In this newsletter, we look at everyday data science work, data privacy and ethics, and a way to drill into and data mine a CSV file with only your browser. But first, can any Fellows help?

Contents

1. Development

  • "Data science is boring"
  • AI regulation is coming 

2. Fun

  • Using R to visualise where you can meet within km restrictions

3. Strategic

  • How to identify AI snake oil
  • The first rule of machine learning: Start without machine learning
  • AI-generated data science with OpenAI codex

4. Ethics

  • The privacy act is changing
  • Data privacy: traditional approaches and machine learning?
  • 11 short videos about AI ethics

5. Tools

  • Jupyter notebooks: Now an App
  • Interrogate CSV with SQL - in browser!

1. Development

"Data science is boring"

For new entrants to the profession, data science is a glamorous area promising interesting problems, experiments and advanced machine learning models. However, this video interview tears down the glamour and discusses how data scientists actually spend their working hours.
 

AI regulation is coming

For years, public concern about technological risk has focused on the misuse of personal data. But as firms embed more and more artificial intelligence in products and processes, attention is shifting to the potential for bad or biased decisions by algorithms. Inevitably, many governments will feel regulation is essential to protect consumers from that risk.

This article from Harvard Business Review explains the moves regulators are most likely to make and the three main challenges businesses need to consider as they adopt and integrate AI.

2. Fun

Using R to visualise where you can meet within km restrictions

Apologies for the link error last time. Here again is ‘Where Do We Meet’, a simple interactive tool built in R Shiny that shows how ‘sf’ (for geometric operations) and Leaflet (for maps) can find available outdoor facilities within, for example, 5km or 10km of two addresses.
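
The core idea, keeping only the facilities that fall within the distance limit of both addresses, can be sketched in a few lines. This is a minimal Python illustration rather than the tool's own code (which is written in R with ‘sf’): it assumes straight-line (great-circle) distance via the haversine formula, and the coordinates, facility names and function names are hypothetical.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (a[0], a[1], b[0], b[1]))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))


def reachable_by_both(facilities, home_a, home_b, limit_km):
    """Names of facilities within limit_km of both home locations."""
    return [
        name
        for name, location in facilities.items()
        if haversine_km(location, home_a) <= limit_km
        and haversine_km(location, home_b) <= limit_km
    ]


# Hypothetical coordinates, for illustration only.
facilities = {
    "Park A": (-33.870, 151.210),
    "Park B": (-33.900, 151.100),
    "Park C": (-33.700, 151.300),
}
home_a = (-33.860, 151.200)
home_b = (-33.880, 151.230)

meeting_spots = reachable_by_both(facilities, home_a, home_b, limit_km=5)
print(meeting_spots)
```

A spatial library would typically do this by buffering each address and intersecting the two buffers; the distance check above is the same test expressed point by point.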

 

3. Strategic

How to identify AI snake oil

AI has delivered spectacular results in specific applications. Unfortunately, this has led to the ‘AI’ label becoming a marketing tool, with firms making unrealistic promises of model performance that at best disappoint and at worst create biased and discriminatory outcomes.

This presentation from Princeton University provides a heuristic about what AI is presently capable of, and some warning signs as to whether a proposal may be ‘snake oil’.

The first rule of machine learning: Start without machine learning

In the same vein: for small to medium data, machine learning often has limited advantages, and a simpler model may be not only just as predictive but also more transparent. A warning while reading the article: ethics-conscious actuaries should resist the temptation to sell snake oil instead.

Read it here.

AI-generated data science with OpenAI codex

Previously the newsletter mentioned GitHub Copilot, which uses AI to automate programming and generate code. This video shows OpenAI's demo of Codex, the model that powers the tool, applied to data science coding. Given the prompt ‘Plot the results. Label both axes (y axis is max temperature), rotate the x ticks, and add a title’, the model is able to generate working code. Human oversight is still needed, but it appears AI has the potential to improve productivity for data scientists too.

4. Ethics

The privacy act is changing

The Privacy Act is changing: how will this affect your machine learning models, and what can you do? This article looks at changes to the Australian Privacy Act currently under consideration and how to address the potential impacts for industry and consumers.

 

Data privacy: traditional approaches and machine learning?

 

This continuation of the previous article explores the potential to apply data privacy approaches used in traditional settings to machine learning models. It describes the challenge with data privacy, and how “differential privacy” and emerging “unlearning” techniques can help.

Read more.
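
For readers new to differential privacy, the sketch below shows one of its standard building blocks, the Laplace mechanism, applied to a simple counting query. It is a minimal Python illustration and is not taken from the article: the data, the function names and the choice of epsilon are hypothetical.

```python
import random


def laplace_noise(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponential draws."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def private_count(records, predicate, epsilon, rng):
    """Count matching records, then add Laplace noise of scale 1/epsilon.

    A counting query has sensitivity 1: adding or removing one person
    changes the true count by at most 1.
    """
    true_count = sum(1 for record in records if predicate(record))
    return true_count + laplace_noise(1 / epsilon, rng)


rng = random.Random(0)  # seeded only so the sketch is repeatable
ages = [23, 35, 41, 52, 29, 64, 38]  # hypothetical data
noisy = private_count(ages, lambda age: age >= 40, epsilon=0.5, rng=rng)
print(noisy)
```

A smaller epsilon means more noise and stronger privacy; a larger epsilon gives a more accurate answer with a weaker guarantee.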

11 short videos about AI ethics

 

Looking for bite-sized introductions to AI and data ethics topics? This collection of 11 videos covers topics such as bias, over-reliance on KPIs, and diversity, in clips less than 15 minutes long.

Watch them here.

 
 

5. Tools

Jupyter notebooks: Now an App

For notebook aficionados, JupyterLab App is now available from the Jupyter organisation: a standalone, self-contained app that includes Python, popular libraries and JupyterLab. For those using Jupyter Notebooks, this may be a significant upgrade. Download links to the installers can be found under ‘Download’ here.

 

Interrogate CSV with SQL - in browser!

 

Investigating a new dataset that’s completely foreign to you and not sure where to start? Or want to play around with SQL without installing anything?

Try sqliteviz, a simple and intuitive web app that lets you visualise your data with a wide range of interactive charts within seconds. The data can be read directly from an SQLite database or imported from CSV.

 

Manipulating the data is also easy to do within the app by submitting SQL queries. Moreover, the computations are fully client-side, which means no data will ever leave your computer.
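
To give a feel for the kind of query involved, here is a minimal sketch of the same workflow (load a CSV into SQLite, then summarise it with SQL) using Python's built-in sqlite3 module rather than sqliteviz itself. The CSV content, table name and column names are hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical CSV content; in practice this would be read from a file.
csv_text = """policy_id,state,premium
1,NSW,1200
2,VIC,950
3,NSW,1430
4,QLD,1100
"""

rows = list(csv.DictReader(io.StringIO(csv_text)))

conn = sqlite3.connect(":memory:")  # in-memory database, nothing written to disk
conn.execute("CREATE TABLE policies (policy_id INTEGER, state TEXT, premium REAL)")
conn.executemany(
    "INSERT INTO policies VALUES (:policy_id, :state, :premium)", rows
)

# Average premium by state, highest first.
query = """
    SELECT state, COUNT(*) AS n, AVG(premium) AS avg_premium
    FROM policies
    GROUP BY state
    ORDER BY avg_premium DESC
"""
summary = conn.execute(query).fetchall()
for state, n, avg_premium in summary:
    print(state, n, avg_premium)
```

The SELECT statement is ordinary SQLite SQL, so a query of this shape should carry over to any tool that runs SQLite against imported CSV data.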

 
 
 

Editors' note

Regulation is always evolving to match emerging technology trends. Actuaries who are aware of the trends can help advise businesses on how to steer clear of pitfalls.

As usual, check out Actuaries Digital for more things to read and the microsite for great learning resources and past Newsletter editions.

Jacky Poon, Henry Ma and Grant Lian
Editors, Data Analytics Newsletter

 

Disclaimer: The Institute wishes it to be understood that any opinions put forward in this publication are not necessarily those of the Institute.



Actuaries Institute
Level 2, 50 Carrington Street
Sydney NSW 2000, Australia
t +61 (0) 2 9239 6100


 
 