<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Project on Constantin Orasan&#39;s website</title>
    <link>https://dinel.org.uk/categories/project/</link>
    <description>Recent content in Project on Constantin Orasan&#39;s website</description>
    <generator>Hugo</generator>
    <language>en</language>
    <copyright>© 2026 CONSTANTIN ORASAN&lt;br&gt;Build:20260405</copyright>
    <lastBuildDate>Sat, 10 Feb 2024 00:00:00 +0000</lastBuildDate>
    <atom:link href="https://dinel.org.uk/categories/project/index.xml" rel="self" type="application/rss+xml" />
    <item>
      <title>Harnessing court data using NLP and spoken language technology</title>
      <link>https://dinel.org.uk/research/projects/harnessingNLP4court/</link>
      <pubDate>Sat, 10 Feb 2024 00:00:00 +0000</pubDate>
      <guid>https://dinel.org.uk/research/projects/harnessingNLP4court/</guid>
      <description>&lt;p&gt;Combining innovations in speech-to-text (STT) technology and Natural Language Processing (NLP), this &lt;a href=&#34;https://gtr.ukri.org/projects?ref=10022430&#34; target=&#34;_blank&#34;&gt;InnovateUK-funded&lt;/a&gt; collaborative project between the University of Surrey and &lt;a href=&#34;https://just-access.org/&#34; target=&#34;_blank&#34;&gt;JUST:access&lt;/a&gt; developed an automated transcription tool designed specifically for the justice sector. The solution also makes it easy to navigate lengthy court hearings with the help of the final judgement. The project had two main phases:&lt;/p&gt;&#xA;&lt;ol&gt;&#xA;&lt;li&gt;Improving Automatic Speech Recognition (ASR): we used NLP to automatically identify important phrases from the legal domain. This information, together with a large dataset of court hearings, was used to fine-tune a generic off-the-shelf ASR system. Evaluation revealed not only a reduction in word error rate, but also better recognition of legal terminology and legal entities. The results are presented in our AI4AJ 2023 paper.&lt;/li&gt;&#xA;&lt;li&gt;Method for linking court hearings with the final judgement: we employed Generative AI technology to identify timespans in court hearings that are relevant to paragraphs of the court judgement. This enables better access to justice by allowing fast navigation of very long court hearings. To train our method, I designed an annotation tool specifically for this purpose. 
The proposed linking method was presented at EMNLP 2023.&lt;/li&gt;&#xA;&lt;/ol&gt;&#xA;&lt;h3 id=&#34;surrey-project-team&#34;&gt;Surrey project team&lt;/h3&gt;&#xA;&lt;ul&gt;&#xA;&lt;li&gt;Constantin Orasan, principal investigator&lt;/li&gt;&#xA;&lt;li&gt;Hadeel Saadany, research fellow&lt;/li&gt;&#xA;&lt;/ul&gt;&#xA;&lt;h3 id=&#34;publications&#34;&gt;Publications&lt;/h3&gt;&#xA;&lt;bibtex src=&#34;https://dinel.org.uk/bibs/harnessing.bib&#34;&gt;&lt;/bibtex&gt;&#xA;&#xA;&lt;script type=&#34;text/javascript&#34;&gt;&#xA;&#x9;var is_bibtex_rendered = false;&#xA;&#xA;&#x9;function bibtex_rendered() {&#xA;&#x9;&#x9;is_bibtex_rendered = true;&#x9;&#x9;&#xA;&#x9;}&#xA;&lt;/script&gt;&#xA;   &#xA;&#xA;&#xA;&#xA;&#xA;&#x9;&lt;div class=&#34;bibtex_structure&#34;&gt;&#xA;&#x9;  &lt;div class=&#34;sort year&#34; extra=&#34;DESC number&#34;&gt;&#xA;&#x9;      &lt;div class=&#34;templates&#34;&gt;&lt;/div&gt;      &#xA;&#x9;  &lt;/div&gt;&#xA;&#x9;&lt;/div&gt;&#xA;&#xA;&#xA;&#xA;&lt;div id=&#34;bibtex_display&#34; callback=&#34;bibtex_rendered()&#34;&gt;&#xA;  &#xA;  &lt;div class=&#34;bibtex_template&#34; style=&#34;display: none;&#34;&gt;&#xA;    &lt;ul class=&#34;list-publications&#34;&gt; &lt;li&gt;&#xA;      &lt;span class=&#34;if !author&#34;&gt;&#xA;      &#x9;&lt;span class=&#34;if editor&#34;&gt;&#xA;      &#x9;&#x9;&lt;span class=&#34;editor&#34;&gt;&lt;/span&gt;&#xA;      &#x9;&lt;/span&gt;&#xA;      &lt;/span&gt;&#xA;      &#x9;&lt;span class=&#34;author&#34;&gt;&lt;/span&gt;&#xA;&#xA;      (&lt;span class=&#34;if year&#34;&gt;&lt;span class=&#34;year&#34;&gt;&lt;/span&gt;&lt;/span&gt;)&#xA;      &lt;span class=&#34;if title&#34;&gt;&#xA;      &#x9;&lt;span class=&#34;if url&#34;&gt;&lt;a class=&#34;url&#34;&gt;&lt;span class=&#34;title&#34;&gt;&lt;/span&gt;&lt;/a&gt;.&lt;/span&gt;&#xA;      &#x9;&lt;span class=&#34;if !url&#34;&gt;&lt;span class=&#34;title&#34;&gt;&lt;/span&gt;.&lt;/span&gt;&#xA;      &lt;/span&gt;&#xA;&#xA;&#xA;      &lt;span class=&#34;if 
journal&#34;&gt;&lt;em&gt;&lt;span class=&#34;journal&#34;&gt;&lt;/span&gt;&lt;/em&gt;,&lt;/span&gt;&#xA;      &lt;span class=&#34;if booktitle&#34;&gt;In &lt;em&gt;&lt;span class=&#34;booktitle&#34;&gt;&lt;/span&gt;&lt;/em&gt;,&lt;/span&gt;&#xA;      &lt;span class=&#34;if publisher&#34;&gt;&lt;span class=&#34;publisher&#34;&gt;&lt;/span&gt;,&lt;/span&gt;&#xA;      &lt;span class=&#34;if volume&#34;&gt;&lt;span class=&#34;volume&#34;&gt;&lt;/span&gt;&lt;/span&gt;&#xA;      &lt;span class=&#34;if number&#34;&gt;(&lt;span class=&#34;number&#34;&gt;&lt;/span&gt;)&lt;/span&gt;&#xA;      &lt;span class=&#34;if address&#34;&gt;&lt;span class=&#34;address&#34;&gt;&lt;/span&gt;,&lt;/span&gt;&#xA;      &lt;span class=&#34;if pages&#34;&gt;pp. &lt;span class=&#34;pages&#34;&gt;&lt;/span&gt;&lt;/span&gt;&#xA;      &lt;div&gt;&#xA;&#x9;      &lt;span class=&#34;if doi&#34;&gt;[doi: &#xA;  &#x9;    &#x9;&lt;a class=&#34;bibtexVar&#34; href=&#34;https://doi.org/+DOI+&#34; extra=&#34;doi&#34; target=&#34;_blank&#34;&gt;&lt;span class=&#34;doi&#34;&gt;&lt;/span&gt;&lt;/a&gt;]&#xA;    &#x9;  &lt;/span&gt;&#xA;&#xA;      &#x9;&lt;span class=&#34;if abstract&#34;&gt;&#xA;        &#x9;&lt;a class=&#34;abstract-box-button&#34; role=&#34;button&#34;&gt;[abstract]&lt;/a&gt;&#xA;        &#x9;&lt;a class=&#34;bibtex-box-button&#34;  role=&#34;button&#34;&gt;[bib]&lt;/a&gt;&#xA;        &#x9;&lt;div class=&#34;abstract-box&#34;&gt;&lt;div&gt;&lt;strong&gt;Abstract&lt;/strong&gt;&lt;/div&gt;&lt;div&gt;&lt;span class=&#34;abstract&#34;&gt;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&#xA;        &#x9;&lt;div class=&#34;bibtex-box&#34;&gt;&lt;div&gt;&lt;strong&gt;BibTeX entry&lt;/strong&gt;&lt;/div&gt;&lt;pre&gt;&lt;span class=&#34;bibtexraw noread&#34;&gt;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;&#xA;      &#x9;&lt;/span&gt;&#xA;&#xA;      &#x9;&lt;span class=&#34;if !abstract&#34;&gt;&#xA;        &#x9;&lt;a class=&#34;bibtex-box-button&#34;  role=&#34;button&#34;&gt;[bib]&lt;/a&gt;&#xA;        
&#x9;&lt;div class=&#34;bibtex-box&#34;&gt;&lt;div&gt;&lt;strong&gt;BibTeX entry&lt;/strong&gt;&lt;/div&gt;&lt;pre&gt;&lt;span class=&#34;bibtexraw noread&#34;&gt;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;&#xA;      &#x9;&lt;/span&gt;&#xA;      &lt;/div&gt;&#xA;&#x9;  &lt;div style=&#34;display:none&#34;&gt;&lt;span class=&#34;bibtextype&#34;&gt;&lt;/span&gt;&lt;/div&gt;&#xA;    &lt;/li&gt;&lt;/ul&gt;&#xA;  &lt;/div&gt;&#xA;  &#xA;&lt;/div&gt;&#xA; &#xA;&#xA;&lt;h3 id=&#34;resources&#34;&gt;Resources&lt;/h3&gt;&#xA;&lt;h4 id=&#34;project-poster&#34;&gt;Project poster&lt;/h4&gt;&#xA;&lt;div class=&#34;vspace-2em gallery-image&#34;&gt;&#xD;&#xA;&lt;iframe src=&#34;https://www.slideshare.net/slideshow/embed_code/key/2uG1re4C0SFjGx?startSlide=1&#34; width=&#34;597&#34; height=&#34;486&#34; frameborder=&#34;0&#34; marginwidth=&#34;0&#34; marginheight=&#34;0&#34; scrolling=&#34;no&#34; style=&#34;border:1px solid #CCC; border-width:1px; margin-bottom:5px;max-width: 100%;&#34; allowfullscreen&gt;&lt;/iframe&gt;&lt;div style=&#34;margin-bottom:5px&#34;&gt;&lt;strong&gt;&lt;a href=&#34;https://www.slideshare.net/slideshows/harnessing-court-data-using-nlp-and-spoken-language-technology/266241377&#34; title=&#34;Harnessing Court Data using NLP and Spoken Language Technology&#34; target=&#34;_blank&#34;&gt;Harnessing Court Data using NLP and Spoken Language Technology&lt;/a&gt;&lt;/strong&gt; from &lt;strong&gt;&lt;a href=&#34;https://www.slideshare.net/dinel&#34; target=&#34;_blank&#34;&gt;Constantin Orasan&lt;/a&gt;&lt;/strong&gt;&lt;/div&gt;&#xD;&#xA;&lt;/div&gt;&#xD;&#xA;&lt;h4 id=&#34;annotation-tool&#34;&gt;Annotation tool&lt;/h4&gt;&#xA;&lt;p&gt;To train and evaluate our linking method, I developed a tool that lets users easily annotate whether a timespan from a court hearing is linked to a paragraph from a court judgement. To facilitate annotation, the tool can play the video or audio of the court hearing. 
The tool is written in Python and uses the &lt;a href=&#34;https://www.djangoproject.com/&#34; target=&#34;_blank&#34;&gt;Django web framework&lt;/a&gt;. Below is a screenshot of the annotation tool. If you would like to use it, please get in touch. The tool is open source, but I still need to tidy up the code and add documentation to make it useful for others.&lt;/p&gt;</description>
    </item>
    <item>
      <title>The MESSAGE project</title>
      <link>https://dinel.org.uk/research/projects/MESSAGE/</link>
      <pubDate>Sat, 03 Feb 2024 00:00:00 +0000</pubDate>
      <guid>https://dinel.org.uk/research/projects/MESSAGE/</guid>
      <description>&lt;p&gt;An EU-funded project that delivered controlled-language standards for messages, alerts and protocols arising from terrorism and other security-related risks, in order to ensure that messages are understood correctly and can be translated reliably where necessary. The project ran from Jan 2008 to Aug 2009. The project page can (still) be accessed at &lt;a href=&#34;http://message-project.univ-fcomte.fr/&#34; target=&#34;_blank&#34;&gt;http://message-project.univ-fcomte.fr/&lt;/a&gt;. The project was coordinated by Université de Franche-Comté, France. In this project, I was the principal investigator for the University of Wolverhampton and Irina Temnikova was the researcher appointed to the project.&lt;/p&gt;</description>
    </item>
  </channel>
</rss>
