Abstract
The increasing influence of artificial intelligence (AI), the availability of textual data, and large language models (LLMs) over the past decade is evident in the growth of scholarly work on identifying skills from job advertisements. In this work, we examine the detection of sentences that express skills as well as the explainability of model decisions with respect to their dependence on skill-related tokens. We compare traditional machine learning (ML) approaches with a pretrained multilingual model and domain-adapted models for the task of English skill identification, and we assess the role of skill tokens in the classification process. We also investigate the ability of these models to generalize from English (EN) to Danish (DA) in both few-shot and zero-shot settings. Our findings indicate that both models achieve high performance in sentence classification, with an F1-score of 94% for EN and overall accuracy between 93% and 94% for both EN and DA. The results show that traditional ML methods can remain relevant under certain circumstances, reinforcing the importance of realistic baselines in the context of skill identification.
| Original language | English |
|---|---|
| Title | Proceedings of the 8th International Conference on Natural Language and Speech Processing (ICNLSP-2025) |
| Place of publication | Odense, Denmark |
| Publisher | Association for Computational Linguistics |
| Pages | 410-415 |
| ISBN (electronic) | 979-8-89176-297-8 |
| Status | Published - 2025 |
| Ministry of Education and Culture (OKM) publication type | A4 Article in conference proceedings |
| Event | International Conference on Natural Language and Speech Processing - Odense, Denmark. Duration: 25 Aug 2025 → 27 Aug 2025. Conference number: 8th. https://www.icnlsp.org/ |
Conference
| Conference | International Conference on Natural Language and Speech Processing |
|---|---|
| Abbreviated title | ICNLSP |
| Country/Territory | Denmark |
| City | Odense |
| Period | 25/08/25 → 27/08/25 |
| Web address | https://www.icnlsp.org/ |
UN Sustainable Development Goals
This output contributes to the following Sustainable Development Goals:
- SDG 4 – Quality Education
- SDG 8 – Decent Work and Economic Growth
Fingerprint
Dive into the research topics of 'Cross-Lingual Sentence-Level Skill Identification in English and Danish Job Advertisements'. Together they form a unique fingerprint.