A Critical Field Guide for Working with Machine Learning Datasets

resource thumbnail

Remove from Bookmarks

Do you really want to remove?
This action cannot be undone. Choose 'Cancel' to stop and go back.
Ratings: 0
  • Which text to add here??

Added by Martin on 2023-02-17 10:44

» Viewed 191 times
» Favorited by 0 user(s)
» 0 Comments
» This resource has public visibility

Holder of Rights:

License: unknown

Creator(s): Sarah Ciston

Description:
Machine learning datasets are powerful but unwieldy. They are often far too large to check all the data manually, to look for inaccurate labels, dehumanizing images, or other widespread issues. Despite the fact that datasets commonly contain problematic material — whether from a technical, legal, or ethical perspective — datasets are also valuable resources when handled carefully and critically. This guide offers questions, suggestions, strategies, and resources to help people work with existing machine learning datasets at every phase of their lifecycle. Equipped with this understanding, researchers and developers will be more capable of avoiding the problems unique to datasets. They will also be able to construct more reliable, robust solutions, or even explore promising new ways of thinking with machine learning datasets that are more critical and conscientious.

Add to Collection

You don't have any collections yet. Click here to create your first collection!

Share to Group

You don't have any group you can share this resource with: the resource is already shared to all groups you are member in. Click here to see available groups!

Create QR Code

Please select the URI for the QR Code:




Tags

sort: alphabeticallyby frequency
use blanks to separate tags

Comments

A Critical Field Guide for Working with Machine Learning Datasets Machine learning datasets are powerful but unwieldy. They are often far too large to check all the data manually, to look for inaccurate labels, dehumanizing images, or other widespread issues. Despite the fact that datasets commonly contain problematic material — whether from a technical, legal, or ethical perspective — datasets are also valuable resources when handled carefully and critically. This guide offers questions, suggestions, strategies, and resources to help people work with existing machine learning datasets at every phase of their lifecycle. Equipped with this understanding, researchers and developers will be more capable of avoiding the problems unique to datasets. They will also be able to construct more reliable, robust solutions, or even explore promising new ways of thinking with machine learning datasets that are more critical and conscientious. Sarah Ciston