- Notifications
You must be signed in to change notification settings - Fork 1.9k
Bringing Fairlearn - GridSearch to ML.NET #6279
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Added `inPlace:true` to the append to fix the bug that no columns are added; Added unit testing to test the basic funcitonality of Metrc.Regression.ByGroup()
Added MSE and MAE (MeanAbsoluteError), and included tests for MSE and RMS
RMS and MSE fully supported by Fairlearn.Metric.Regression now accross all three functions
created the moment class and utilityParity class. Initial commit
Passed the initial unit tests for Demographic Parity. Every class is made public which needs to be changed in the future. Utility Parity still needs to be changed for other parities to work
…ificationSearchSpaceGenerator update
…lculating the signed weights
There is still a lot of development needed to be done with the moment class
…search getting sensitive feature column names directly from a getter function, adjusted default value for the option to a random value
built the first prototype for the gridSearchTrialRunner. We have enabled a seperate training set for training and testing set for validating the result, which is a different approach from the original implementation.
Added a new feature so that users can look up row item by the name of the column instead of the raw row index
…ridsearch" This reverts commit a9f07a9.
… fairlearn In fairlearn AutoML, we have to add in a singleton to the serviceCollection called moment, which will be later extracted to calculate the fairness parity.
Added an if statement to output fairlearn metric if using fairlearn
…nsion in fairlearn" This reverts commit 328f718.
the experiment is able to add a moment to its serviceCollection which is later used to calculate fairlearn parity.
Created a tuner so that we can go through the search space through the gridsearch algorithm
| } | ||
| //TODO: what should be the object type of X be? How can I make x capitilized to fit the whole data strcuture |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Space between these.
| /// overall mean utility | ||
| /// | ||
| /// </summary> | ||
| public class UtilityParity : ClassificationMoment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this class going to be used by the users? If not we should keep them internal. In general, if the user doesn't need to directly interact with it we should keep things internal.
| using Microsoft.Data.Analysis; | ||
| | ||
| | ||
| namespace Microsoft.ML.Fairlearn.reductions |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reductions namespace should start with caps.
| /// <param name="x">The features</param> | ||
| /// <param name="y">The label</param> | ||
| /// <param name="sensitiveFeature">The sensitive groups</param> | ||
| public new void LoadData(IDataView x, DataFrameColumn y, StringDataFrameColumn sensitiveFeature)//, StringDataFrameColumn events = null, StringDataFrameColumn utilities = null) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you still need the comments at the end?
| } | ||
| /// <summary> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Space between here.
| } | ||
| } | ||
| | ||
| public class DemographicParity : UtilityParity |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you need an empty class?
| } | ||
| private DataFrame CreateDummyDataset() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Spacing.
| | ||
| namespace Microsoft.ML.Fairlearn.Tests | ||
| { | ||
| public class GridSearchTest |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should extend BaseTestClass
| // Data generated so it is identical from Binary_Classification.ipynb from Fairlearn.github on Github | ||
| private DataFrame CreateGridScearhDataset() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Spacing
| } | ||
| /// <summary> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Spacing.
| return df; | ||
| } | ||
| /// <summary> | ||
| /// This trial runner run the tests from Grid searh for Binary Classification.ipynb |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dont need summary comments for test methods.
| // Licensed to the .NET Foundation under one or more agreements. | ||
| // The .NET Foundation licenses this file to you under the MIT license. | ||
| // See the LICENSE file in the project root for more information. | ||
| |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove extra space.
| using Microsoft.ML.Data; | ||
| using Xunit; | ||
| | ||
| |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove extra space.
| | ||
| namespace Microsoft.ML.Fairlearn.Tests | ||
| { | ||
| public class MetricTest |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Extend BaseTestClass
| } | ||
| public class HouseData |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Spacing.
| IDataView data; | ||
| public MetricTest() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Spacing.
| | ||
| namespace Microsoft.ML.Fairlearn.Tests | ||
| { | ||
| public class UtilityTest |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Spacing.
Should extend BaseTestClass
| } | ||
| [Fact] | ||
| public void DemographyParityTest() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Spacing.
| int[] vs = { 1, 1, 1, 1, 1, 1, 1, 0, 0, 0 }; | ||
| PrimitiveDataFrameColumn<int> y = new PrimitiveDataFrameColumn<int>("label", vs); | ||
| | ||
| |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove extra space.
michaelgsharp left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly minor formatting fixes.
| This pull request has been automatically marked |
| @jordi1215 could you please address the feedback and resolve the conflict in this PR? Thanks! |
| This pull request has been automatically marked |
| This pull request will now be closed since it had been marked |
Related to: #1912
Related to: Issue 1091 on the Fairlearn Repo
As part of my summer internship project, I have created several classes that will provide a foundation for the further inclusion of Fairlearn algorithms into ML.NET.
For the first PR, I have included an end-to-end working solution for the gridSearch method in Fairlearn which leverages AutoML for parameter searching. The example is included in the Unit test.
We are excited to review your PR.
So we can do the best job, please check:
Fixes #nnnnin your description to cause GitHub to automatically close the issue(s) when your PR is merged.