#45 Paris Women in Machine Learning and Data Science: dealing with unlabelled data, combinatorial optimisation with policy adaptation using latent space search, and ageism in AI

WiMLDS Paris
Dec 9, 2023
Natalie, Shikha, Sophie, Caroline, Juliette, Sharone, Daria, Jihane, Caroline, Marie

Last week, on November 30, we hosted our final Meetup for 2023 (teaser: exciting plans are already underway for 2024). The event was an exceptional collaboration with the Women in Big Data association, with not three, but five outstanding speakers!

We extend our heartfelt gratitude to Contentsquare for generously hosting us in the heart of Paris.

It was also a special occasion, as our new volunteer, Jihane, ran her first Meetup session. After the customary introduction to WiMLDS and its objectives, she announced a significant milestone: the Paris chapter has now become the second largest chapter worldwide! She then unveiled details about our forthcoming event, scheduled for February 10 at the newly inaugurated Institut Henri Poincare, where WiMLDS will offer a visit for 30 members of the community.

Sharone then presented Women in Big Data, an association offering mentorship programs to female practitioners, backed by a community of over 18,000 members globally. For those eager to get involved, membership registration is available here, and their LinkedIn page can be followed here.

Sharone Dayan

Our first two speakers, Sharone Dayan, Machine Learning Engineer, and Daria Stefic, Data Scientist, both from Contentsquare, captivated the audience by delving into evaluation strategies for dealing with partially labelled or unlabelled data. They outlined three primary strategies, each accompanied by a real-life use case.

  • In the context of identifying users with purchase intent, Sharone emphasised the pivotal role of labelled data in training and evaluating classifiers. However, because the available labels come solely from users who converted, identifying non-buyers who nonetheless had purchase intent posed a significant challenge. Sharone explained their approach of using a proxy label: a user who adds an item to their cart is treated as exhibiting purchase intent.
  • The second case study explored how Sharone and her team opted to generate artificial data to facilitate user segment discovery. Their aim was to understand user personas based on users’ page visit sequences. They built four types of session generators, each producing a distinct known pattern, and used the generated sessions to validate that the clustering methodology recovers those four patterns.
  • Daria then presented the final case study, focusing on an existing module of their product, the alerting module. Its objective is to notify users of issues without inundating them with alerts for minor variations. To validate the model’s accuracy in raising true alerts, they annotated anomalous versus non-anomalous events over a defined time period. They found that annotating periods, rather than specific points, was more effective: anomalies typically span five minutes or more, so period-level annotation allows a more accurate quantification of model results.
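To make the third case study more concrete, here is a minimal sketch of period-based evaluation. It is an illustration under stated assumptions, not Contentsquare's implementation: the function name `period_scores`, the timestamp format (minutes), and the sample data are all hypothetical. An alert counts as a true alert if it falls inside any annotated period, and a period counts as detected if at least one alert falls inside it.

```python
from typing import List, Tuple

# (start, end) of an annotated anomalous period, in minutes, inclusive.
# This format is an assumption made for the example.
Period = Tuple[float, float]


def period_scores(alerts: List[float], periods: List[Period]) -> Tuple[float, float]:
    """Return (precision, recall) when anomalies are annotated as periods.

    precision: share of alerts that fall inside an annotated period.
    recall: share of annotated periods hit by at least one alert.
    """
    def inside(t: float, p: Period) -> bool:
        return p[0] <= t <= p[1]

    true_alerts = sum(any(inside(t, p) for p in periods) for t in alerts)
    detected = sum(any(inside(t, p) for t in alerts) for p in periods)
    precision = true_alerts / len(alerts) if alerts else 0.0
    recall = detected / len(periods) if periods else 0.0
    return precision, recall


# Hypothetical data: two annotated periods and three raised alerts.
annotated = [(10.0, 16.0), (40.0, 47.0)]
raised = [12.0, 14.5, 30.0]
precision, recall = period_scores(raised, annotated)
print(precision, recall)
```

With point-level annotation, the two alerts raised during the first anomaly would only count if they hit the exact annotated minute. With periods, both are credited and the anomaly is counted once.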

If you want to know more, check out their slides below:

Sophie Monnier and Shikha Surana

The second presentation of the evening was delivered by Sophie Monnier and Shikha Surana, both from InstaDeep, and focused on a paper recently accepted at NeurIPS 2023: Combinatorial Optimisation with Policy Adaptation using Latent Space Search, conveniently abbreviated as COMPASS.

Sophie commenced with a brief introduction to InstaDeep, highlighting that over 10% of their workforce comprises full-time researchers!

Shikha then presented the COMPASS framework. Their work targets combinatorial optimisation: problems with a finite set of candidate solutions, where the goal is to find the best one. The difficulty is that this set grows exponentially with problem size, so exhaustive search quickly becomes infeasible. This is where reinforcement learning emerges as a game-changer.

COMPASS aims to solve combinatorial optimisation problems as well as possible within a fixed budget. Previous methods combined pre-trained policies with additional search strategies, and were limited by computational expense and poor generalisation. COMPASS instead trains a single set of policy parameters conditioned on a continuous latent space, which amounts to an effectively infinite set of diverse policies. It offers a framework for training this set of specialised policies and searching it efficiently during inference.
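The idea of searching a latent space under a budget can be sketched on a toy travelling salesman instance. Everything below is an assumption made for illustration, not the COMPASS implementation: the "policy" is a hand-written nearest-neighbour heuristic biased by a 2D latent vector `z` rather than a trained neural network, and the sample-and-refine loop is a simple stand-in for the search procedure used in the paper. The names `rollout` and `latent_search` are hypothetical.

```python
import math
import random


def tour_length(tour, cities):
    """Length of the closed tour visiting cities in the given order."""
    n = len(tour)
    return sum(math.dist(cities[tour[i]], cities[tour[(i + 1) % n]]) for i in range(n))


def rollout(z, cities):
    """Toy latent-conditioned policy (hypothetical): nearest neighbour biased by z.

    With z = (0, 0) this is the plain nearest-neighbour heuristic; other
    latent vectors yield different, deterministic tours.
    """
    tour, left = [0], set(range(1, len(cities)))
    while left:
        cur = cities[tour[-1]]
        nxt = min(
            sorted(left),
            key=lambda c: math.dist(cur, cities[c]) - (z[0] * cities[c][0] + z[1] * cities[c][1]),
        )
        tour.append(nxt)
        left.remove(nxt)
    return tour


def latent_search(cities, budget=60, pop=10, sigma=1.0, seed=0):
    """Search the latent space using at most `budget` policy rollouts."""
    rng = random.Random(seed)
    mean = best_z = (0.0, 0.0)
    best_tour = rollout(best_z, cities)
    best_len = tour_length(best_tour, cities)
    used = 1
    while used + pop <= budget:
        # Sample a population of latent vectors around the current mean.
        for _ in range(pop):
            z = (rng.gauss(mean[0], sigma), rng.gauss(mean[1], sigma))
            tour = rollout(z, cities)
            length = tour_length(tour, cities)
            if length < best_len:
                best_z, best_tour, best_len = z, tour, length
        used += pop
        # Recentre on the best latent found so far and narrow the search.
        mean = best_z
        sigma *= 0.7
    return best_tour, best_len, used


# Hypothetical instance: 12 random cities in the unit square.
city_rng = random.Random(1)
cities = [(city_rng.random(), city_rng.random()) for _ in range(12)]
baseline = tour_length(rollout((0.0, 0.0), cities), cities)
best_tour, best_len, used = latent_search(cities)
print(f"baseline={baseline:.3f} best={best_len:.3f} rollouts={used}")
```

The policy parameters never change during the search; only the latent vector does, which is what keeps adaptation to a new instance cheap relative to fine-tuning.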

Remarkably, COMPASS achieved state-of-the-art performance across 29 tasks and demonstrated exceptional generalisation capabilities, a notable stride in this field.

If you want to know more, check out their slides below:

Caroline Jean-Pierre

Then Caroline Jean-Pierre took the stage to shed light on a pressing issue within the tech industry: ageism. She became involved in this subject during her work on the Data for Good white paper (available here), which examines the risks associated with generative AI. Unlike other biases and forms of discrimination, ageism stands out as a bias that will affect every one of us, yet it receives far less attention in the media.

In her insightful presentation, Caroline explained that ageism in artificial intelligence can manifest in five forms:

  • Technical: Age-related biases infiltrate algorithms and datasets utilised for training AI models, often leading to the exclusion or underrepresentation of certain age groups.
  • Personal: Within the tech industry, stereotypes and ideologies propagated by influential figures contribute to a lack of diversity among AI professionals. These prevailing biases further seep into the development and design of AI technologies.
  • Discourse: Discussions concerning AI often overlook the age category and the challenges faced by older individuals. Biases and discrimination predominantly focus on gender and ethnicity, neglecting the issues surrounding age.
  • Group: Automated decision-making tools employed in processes such as hiring can exhibit discriminatory behaviour against senior individuals, perpetuating age-based biases.
  • User: AI services and products that are not designed with the needs of older individuals in mind can significantly marginalise them, denying them access or failing to address their requirements.

If you want to know more, check out her slides below:

Thank you Caroline for this insightful presentation!

Our next event is likely to take place at the end of January. Stay tuned for updates!

If you do not want to miss our events, you can:

🔗 follow us on Twitter, Meetup, and LinkedIn

📑 check our Google spreadsheet if you want to speak 📣, host 💙, help 🌠

📍join our Slack channel for information, discussions, and opportunities

📩 send an email to the Paris WiMLDS team at paris@wimlds.org

🎬 subscribe to our WiMLDS Paris YouTube channel

📸 follow the global WiMLDS on Instagram, LinkedIn, and Facebook

🔥 share your company or lab’s job offers for free on the global WiMLDS’ website.


WiMLDS Paris is a community of women interested in Machine Learning & Data Science