Validation

DATASETS RELIABILITY & CORRECTNESS

1. Clever Hans effect - refers to cues left in the dataset that models pick up on instead of actually solving the defined task!
    Ablating, i.e. removing, part of a model and observing the impact this has on performance is a common method for verifying that the part in question is useful. If performance doesn't go down, then the part is useless and should be removed. Carrying this method over to datasets, it should become common practice to perform dataset ablations as well, for example (a minimal code sketch follows this list):
    Provide only incomplete input (as done in the reviewed paper): This verifies that the complete input is required. If not, the dataset contains cues that allow taking shortcuts.
    Shuffle the input: This verifies the importance of word (or sentence) order. If a bag-of-words/sentences gives similar results, even though the task requires sequential reasoning, then the model has not learned sequential reasoning and the dataset contains cues that allow the model to "solve" the task without it.
    Assign random labels: How much does performance drop if ten percent of instances are relabeled randomly? How much if all labels are random? If scores don't change much, the model probably didn't learn anything interesting about the task.
    Randomly replace content words: How much does performance drop if all noun phrases and/or verb phrases are replaced with random ones? If not much, the dataset may provide unintended non-content cues, such as sentence length or the distribution of function words.
​2. Paper
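A minimal sketch of what such dataset ablations can look like in practice. `train_fn` and `score_fn` are hypothetical placeholders (not from any specific library): `train_fn(texts, labels)` returns a fitted model, and `score_fn(model)` evaluates it on a fixed, untouched held-out set.

```python
# Minimal dataset-ablation sketch; train_fn and score_fn are hypothetical placeholders.
import random

def shuffle_tokens(text, rng):
    """Destroy word order while keeping the bag of words."""
    tokens = text.split()
    rng.shuffle(tokens)
    return " ".join(tokens)

def corrupt_labels(labels, fraction, rng):
    """Randomly relabel a fraction of the instances (may keep the same label by chance)."""
    labels = list(labels)
    label_set = sorted(set(labels))
    for i in rng.sample(range(len(labels)), int(fraction * len(labels))):
        labels[i] = rng.choice(label_set)
    return labels

def ablation_report(train_fn, score_fn, texts, labels, seed=0):
    rng = random.Random(seed)
    results = {
        "baseline": score_fn(train_fn(texts, labels)),
        "shuffled input": score_fn(train_fn([shuffle_tokens(t, rng) for t in texts], labels)),
        "10% random labels": score_fn(train_fn(texts, corrupt_labels(labels, 0.1, rng))),
        "100% random labels": score_fn(train_fn(texts, corrupt_labels(labels, 1.0, rng))),
    }
    for name, score in results.items():
        print(f"{name:>20}: {score:.3f}")
    return results
```

If the shuffled-input or random-label scores stay close to the baseline, the dataset probably contains shortcut cues.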

UNIT / DATA TESTS

    1.
    A great :P unit test and logging post on Medium - it's actually mine :)
    2.
    A mind-blowing lecture about unit testing your data using Voluptuous & Engarde & TDDA lecture
    3.
    Great Expectations, article, "TDDA" for unit tests and CI
    4.
    DataProfiler git
    5.
    Unit test asserts
    6.
    A good pytest tutorial (a minimal data-test example follows this list)
    7.
    Mock, mock 2
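To make the idea concrete, a minimal sketch of data unit tests with pytest and pandas; the file path and column names ("data/train.csv", "label", "age", "id") are hypothetical placeholders for your own schema.

```python
# Minimal data-test sketch with pytest + pandas; path and column names are placeholders.
import pandas as pd
import pytest

@pytest.fixture(scope="module")
def df():
    return pd.read_csv("data/train.csv")

def test_target_has_no_nulls(df):
    assert df["label"].notna().all(), "target column contains nulls"

def test_age_is_plausible(df):
    assert df["age"].between(0, 120).all(), "age values outside [0, 120]"

def test_ids_are_unique(df):
    assert df["id"].is_unique, "duplicate ids found"

def test_label_values_are_known(df):
    assert set(df["label"].unique()) <= {0, 1}, "unexpected label values"
```

The libraries above (Great Expectations, TDDA, etc.) wrap the same idea in richer schema definitions, profiling and CI reports.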

REGULATION FOR AI

    1.
    EU regulation DOC
    2.
    EIOPA - regulation for insurance companies.
    3.
    Ethics and regulations in Israel
      1.
      First report by the intelligence committee headed by Prof. Itzik Ben Israel and Prof. Evyatar Matanya
      2.
      Third report, by the Meizam Leumi (national initiative) for AI systems, on ethics and regulation in Israel, lecture

FAIRNESS, ACCOUNTABILITY & TRANSPARENCY ML

    1.
    FATML website - The past few years have seen growing recognition that machine learning raises novel challenges for ensuring non-discrimination, due process, and understandability in decision-making. In particular, policymakers, regulators, and advocates have expressed fears about the potentially discriminatory impact of machine learning, with many calling for further technical research into the dangers of inadvertently encoding bias into automated decisions.
At the same time, there is increasing alarm that the complexity of machine learning may reduce the justification for consequential decisions to "the algorithm made me do it."
    2.
    ​FAccT - A computer science conference with a cross-disciplinary focus that brings together researchers and practitioners interested in fairness, accountability, and transparency in socio-technical systems.
    3.
    Bengio on AI
    4.
    Poisoning attacks on fairness - Research in adversarial machine learning has shown how the performance of machine learning models can be seriously compromised by injecting even a small fraction of poisoning points into the training data. ... We empirically show that our attack is effective not only in the white-box setting, in which the attacker has full access to the target model, but also in a more challenging black-box scenario in which the attacks are optimized against a substitute model and then transferred to the target model.


FAIRNESS TOOLS

    1.
    Fairlearn - A Python package to assess and improve fairness of machine learning models (the group metrics behind such tools are sketched in code at the end of this section).
    2.
    Sk-lego
      1.
      Regression
      2.
      Classification
      3.
      Information filter
M. Zafar et al. (2017), Fairness Constraints: Mechanisms for Fair Classification
M. Hardt, E. Price and N. Srebro (2016), Equality of Opportunity in Supervised Learning
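The two papers above define the group metrics that these packages report (demographic parity and equal opportunity). A minimal pandas sketch of those metrics, with y_true, y_pred and group as hypothetical arrays - this is not the Fairlearn or sklego API, just the underlying computation.

```python
# Minimal group-fairness metric sketch (not the Fairlearn / sklego APIs).
# y_true, y_pred are 0/1 arrays; group holds a sensitive attribute per row.
import pandas as pd

def group_fairness_report(y_true, y_pred, group):
    df = pd.DataFrame({"y_true": y_true, "y_pred": y_pred, "group": group})
    # Demographic parity: P(y_pred = 1) should be similar across groups.
    selection_rate = df.groupby("group")["y_pred"].mean()
    # Equal opportunity (Hardt et al. 2016): TPR = P(y_pred = 1 | y_true = 1)
    # should be similar across groups.
    tpr = df[df["y_true"] == 1].groupby("group")["y_pred"].mean()
    return pd.DataFrame({"selection_rate": selection_rate, "tpr": tpr})

# Usage (hypothetical): group_fairness_report(y_test, model.predict(X_test), X_test["gender"])
```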

​

INTERPRETABLE / EXPLAINABLE AI (XAI)

    2.
    Interpretability and Explainability in Machine Learning course / slides. Understanding, evaluating, rule-based, prototype-based, risk scores, generalized additive models, explaining black boxes, visualizing, feature importance, actionable explanations, causal models, human in the loop, connection with debugging.
    4.
    ​explainML tutorial​
    6.
    Paper: Principles and practice of explainable models - a really good review of everything XAI - "a survey to help industry practitioners (but also data scientists more broadly) understand the field of explainable machine learning better and apply the right tools. Our latter sections build a narrative around a putative data scientist, and discuss how she might go about explaining her models by asking the right questions. From an organization viewpoint, after motivating the area broadly, we discuss the main developments, including the principles that allow us to study transparent models vs opaque models, as well as model-specific or model-agnostic post-hoc explainability approaches. We also briefly reflect on deep learning models, and conclude with a discussion about future research directions."
    8.
    (great) Interpretability overview: transparent models (simulatability, decomposability, algorithmic transparency), post-hoc interpretability (text explanation, visual, local, explanation by example), evaluation, utility.
    1.
    Paper: pitfalls to avoid when interpreting ML models - "A growing number of techniques provide model interpretations, but can lead to wrong conclusions if applied incorrectly. We illustrate pitfalls of ML model interpretation such as bad model generalization, dependent features, feature interactions or unjustified causal interpretations. Our paper addresses ML practitioners by raising awareness of pitfalls and pointing out solutions for correct model interpretation, as well as ML researchers by discussing open issues for further research." - Molnar et al.
    1.
    *** Whitening a black box - this is very good; includes ELI5, LIME, SHAP, and many others.
    3.
    ​Alibi-explain - White-box and black-box ML model explanation library. Alibi is an open source Python library aimed at machine learning model inspection and interpretation. The focus of the library is to provide high-quality implementations of black-box, white-box, local and global explanation methods for classification and regression models.
    1.
    ​Hands on explainable ai youtube, git​
    2.
    Explainability methods are not always consistent and do not agree with each other; this article has a sensible explanation and flow for using SHAP and its many plots.
    1.
    Intro to shap and lime, part 1, part 2​
    2.
    Lime
      2.
      LIME to interpret NLP and image models, github - In the experiments in our research paper, we demonstrate that both machine learning experts and lay users greatly benefit from explanations similar to Figures 5 and 6 and are able to choose which models generalize better, improve models by changing them, and get crucial insights into the models' behavior.
    3.
    Anchor
      1.
      Anchor, from the authors of LIME - An anchor explanation is a rule that sufficiently "anchors" the prediction locally, such that changes to the rest of the feature values of the instance do not matter. In other words, for instances on which the anchor holds, the prediction is (almost) always the same.
    5.
    SHAP advanced
      2.
      ​What are shap values on kaggle - whatever you do start with this
      3.
      ​Shap values on kaggle #2 - continue with this
      4.
      How to calculate Shap values per class based on this graph
    1.
    SHAP intro, part 2, with many algorithm examples and an explanation of the four plots (a minimal usage sketch appears at the end of this section).
    3.
    ​Trusting models​
    5.
    Keras-vis for CNNs: 3 methods - activation maximization, saliency maps, and class activation maps.
    6.
    ​The notebook! Blog​
    7.
    ​More resources!​
    8.
    ​Visualizing the impact of feature attribution baseline - Path attribution methods are a gradient-based way of explaining deep models. These methods require choosing a hyperparameter known as the baseline input. What does this hyperparameter mean, and how important is it? In this article, we investigate these questions using image classification networks as a case study. We discuss several different ways to choose a baseline input and the assumptions that are implicit in each baseline. Although we focus here on path attribution methods, our discussion of baselines is closely connected with the concept of missingness in the feature space - a concept that is critical to interpretability research.
    9.
    WHAT IF TOOL - GOOGLE, notebook, walkthrough​
    10.
    ​Language interpretability tool (LIT) - The Language Interpretability Tool (LIT) is an open-source platform for visualization and understanding of NLP models.
    11.
    Stop Explaining Black Box Machine Learning Models for High Stakes Decisions and Use Interpretable Models Instead - "trying to explain black box models, rather than creating models that are interpretable in the first place, is likely to perpetuate bad practices and can potentially cause catastrophic harm to society. There is a way forward - it is to design models that are inherently interpretable. This manuscript clarifies the chasm between explaining black boxes and using inherently interpretable models, outlines several key reasons why explainable black boxes should be avoided in high-stakes decisions, identifies challenges to interpretable machine learning, and provides several example applications where interpretable models could potentially replace black box models in criminal justice, healthcare, and computer vision."
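A minimal SHAP usage sketch tying the SHAP entries above together, assuming a trained tree-based model `model` and a pandas feature matrix `X` already exist; the exact API varies between SHAP versions, so treat this as a starting point rather than a recipe.

```python
# Minimal SHAP sketch for a tree-based model. `model` and `X` are assumed to exist;
# newer SHAP versions prefer `explainer(X)` over `explainer.shap_values(X)`.
import shap

explainer = shap.TreeExplainer(model)      # fast, model-specific explainer for trees
shap_values = explainer.shap_values(X)     # per-feature contribution for every row

# Global view: which features drive predictions across the whole dataset.
shap.summary_plot(shap_values, X)

# Local view: why the first instance got its prediction
# (single-output / regression case; multi-class models return a list per class).
shap.force_plot(explainer.expected_value, shap_values[0], X.iloc[0])
```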

WHY WE SHOULDN’T TRUST MODELS

    1.
      1.
      Datasets need more love
      2.
      Datasets ablation and public beta
      3.
      Inter-prediction agreement
    2.
    Behavioral testing and CheckList (a toy invariance test is sketched below)
      1.
      ​Blog, Youtube, paper, git​
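A toy sketch of the invariance-test idea behind CheckList (not the checklist library's own API); `predict` is a hypothetical function mapping a list of texts to labels, and `perturb` is a label-preserving edit.

```python
# Toy CheckList-style invariance test (not the checklist library's API).
def invariance_failure_rate(predict, texts, perturb):
    """Fraction of predictions that change under a label-preserving perturbation."""
    original = predict(texts)
    perturbed = predict([perturb(t) for t in texts])
    failures = sum(o != p for o, p in zip(original, perturbed))
    return failures / len(texts)

# Example: swapping a person's name should not flip a sentiment prediction.
# rate = invariance_failure_rate(predict, reviews, lambda t: t.replace("John", "Maria"))
```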

BIAS

    1.
    arize.ai on model bias.

DEBIASING MODELS

    1.
    Adversarial removal of demographic features - "We show that demographic information of authors is encoded in -- and can be recovered from -- the intermediate representations learned by text-based neural classifiers. The implication is that decisions of classifiers trained on textual data are not agnostic to -- and likely condition on -- demographic attributes." "We explore several techniques to improve the effectiveness of the adversarial component. Our main conclusion is a cautionary one: do not rely on the adversarial training to achieve invariant representation to sensitive features."
    2.
    Null It Out: Guarding Protected Attributes by Iterative Nullspace Projection (paper), github, presentation, by Shauli et al. - removing biased information, such as gender, from an embedding space using nullspace projection (a simplified code sketch appears at the end of this section). The objective is this: given a representation of text, for example BERT embeddings of many resumes/CVs, we want to reach a state where a certain attribute, for example the gender of the person who wrote the resume, is not encoded in X. They use a relaxed definition of "not encoded", i.e., you can't predict the attribute from the representation with a better-than-random score using a linear model. In other words, every linear model you train will fail to predict the author's gender from the embedding space and will reach about 50% accuracy. This is done by an iterative process: 1. Train a linear model to predict the attribute from the representation. 2. Project the representation onto the nullspace of that linear classifier - a standard linear-algebra operation that zeroes out the component of the representation lying along the separating direction the classifier found, making that classifier useless (it will always predict the zero vector). This is repeated on the neutralized output, i.e., in the second iteration we look for an alternative way to predict the gender from X, until we reach 50% accuracy (or whatever metric you want to measure); at that point we have neutralized all the linear directions in the embedding space that were predictive of the author's gender.
For a matrix W, the null space is the sub-space of all X such that WX = 0, i.e., W maps X to the zero vector; projecting onto it removes the directions W can "see". For example, you can take a 3d vector and project it onto the XY plane.
    1.
    Can we eliminate overly predictive samples? It's an open question - maybe we can use influence functions?
Understanding Black-box Predictions via Influence Functions - How can we explain the predictions of a black-box model? In this paper, we use influence functions - a classic technique from robust statistics - to trace a model's prediction through the learning algorithm and back to its training data, thereby identifying training points most responsible for a given prediction.
We show that even on non-convex and non-differentiable models where the theory breaks down, approximations to influence functions can still provide valuable information. On linear models and convolutional neural networks, we demonstrate that influence functions are useful for multiple purposes: understanding model behavior, debugging models, detecting dataset errors, and even creating visually indistinguishable training-set attacks.
    2.
    ​Bias detector by intuit - Based on first and last name/zip code the package analyzes the probability of the user belonging to different genders/races. Then, the model predictions per gender/race are compared using various bias metrics.
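A simplified sketch of the iterative nullspace projection idea described above (not the authors' official implementation, which composes the successive projections more carefully); X is assumed to be a NumPy matrix of embeddings and z a binary protected attribute.

```python
# Simplified iterative nullspace projection sketch (see the "Null It Out" paper/github
# for the official implementation). X: (n_samples, dim) embeddings, z: binary attribute.
import numpy as np
from sklearn.linear_model import LogisticRegression

def null_projection(w):
    """Projection onto the nullspace of a single direction w: P = I - w w^T / ||w||^2."""
    w = w / np.linalg.norm(w)
    return np.eye(len(w)) - np.outer(w, w)

def inlp(X, z, n_iters=20, chance=0.55):
    X_proj = X.copy()
    for _ in range(n_iters):
        clf = LogisticRegression(max_iter=1000).fit(X_proj, z)
        # Stop once the attribute is no longer linearly recoverable
        # (threshold assumes a roughly balanced binary attribute).
        if clf.score(X_proj, z) <= chance:
            break
        P = null_projection(clf.coef_[0])   # remove the direction the classifier found
        X_proj = X_proj @ P                 # P is symmetric, so no transpose needed
    return X_proj
```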


PRIVACY

    1.
    ​Fairness in AI​

DIFFERENTIAL PRIVACY

    1.
    Differential privacy has emerged as a major area of research in the effort to prevent the identification of individuals and private data. It is a mathematical definition of the privacy loss that results for individuals when their private information is used to create AI products. It works by injecting noise into a dataset, during a machine learning training process, or into the output of a machine learning model, without introducing significant adverse effects on data analysis or model performance. It achieves this by calibrating the noise level to the sensitivity of the algorithm. The result is a differentially private dataset or model that cannot be reverse-engineered by an attacker, while still providing useful information. Privacy loss is quantified by the epsilon parameter (a minimal Laplace-mechanism sketch follows this list).
    2.
    ​youtube​
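A minimal sketch of the Laplace mechanism, the textbook way to release an epsilon-differentially private count; it is not tied to any specific DP library, and the data in the usage example is hypothetical.

```python
# Minimal Laplace-mechanism sketch (textbook epsilon-DP for a count query,
# which has sensitivity 1); not the API of any particular DP library.
import numpy as np

def private_count(values, predicate, epsilon, seed=None):
    """Release a noisy count; smaller epsilon = stronger privacy = more noise."""
    rng = np.random.default_rng(seed)
    true_count = sum(bool(predicate(v)) for v in values)
    return true_count + rng.laplace(loc=0.0, scale=1.0 / epsilon)

# Example (hypothetical data): noisy number of users over 40, with epsilon = 0.5.
# noisy = private_count(ages, lambda a: a > 40, epsilon=0.5)
```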

ANONYMIZATION

DE-ANONYMIZATION

    1.
    GPT-2 - de-anonymization of language datasets
    ​