• A Combined Unsupervised Technique for Automatic Classification in Electronic Discovery
  • Ayetiran, Eniafe Festus <1984>

Subject

  • IUS/20 Filosofia del diritto (Philosophy of Law)

Description

  • In this work, we present an automated, unsupervised approach to retrieval and classification in eDiscovery. The approach is a form of ad hoc retrieval that creates a representative for each original document in the collection using a latent Dirichlet allocation (LDA) model with Gibbs sampling, and uses word sense disambiguation (WSD) to give these representative documents and the queries deeper meaning for distributional semantic similarity. The word sense disambiguation technique is itself a hybrid algorithm derived from a modified version of the original Lesk algorithm and the Jiang & Conrath similarity measure. The technique was evaluated on the TREC Legal Track. Results and observations are discussed in chapter 8. We conclude that WSD can improve ad hoc retrieval effectiveness. Finally, we suggest further work on efficient word sense disambiguation algorithms, which could further improve retrieval effectiveness if applied to the original document collections rather than to representative collections.
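
    To illustrate the gloss-overlap idea behind Lesk-style disambiguation mentioned in the description, the following is a minimal sketch of a simplified Lesk procedure. It is not the thesis's hybrid algorithm and does not include the Jiang & Conrath measure or the LDA step; the sense inventory `TOY_SENSES`, the stopword list, and the function names are hypothetical and chosen only for illustration.

```python
# Hypothetical toy sense inventory: glosses are invented for illustration,
# not taken from WordNet or from the thesis.
TOY_SENSES = {
    "bank": {
        "bank.financial": "an institution that accepts deposits and lends money",
        "bank.river": "sloping land beside a body of water such as a river",
    }
}

# Small assumed stopword list so that function words do not count as overlap.
STOPWORDS = {"a", "an", "the", "of", "and", "that", "such", "as", "to", "in", "on", "at"}


def tokenize(text):
    """Lowercase, split on whitespace, and drop stopwords."""
    return {w for w in text.lower().split() if w not in STOPWORDS}


def simplified_lesk(word, context, senses=TOY_SENSES):
    """Return the sense whose gloss shares the most words with the context.

    Ties go to the sense listed first; unknown words return None.
    """
    context_tokens = tokenize(context)
    best_sense, best_overlap = None, -1
    for sense_id, gloss in senses.get(word, {}).items():
        overlap = len(context_tokens & tokenize(gloss))
        if overlap > best_overlap:
            best_sense, best_overlap = sense_id, overlap
    return best_sense


if __name__ == "__main__":
    print(simplified_lesk("bank", "she deposits money at the bank"))
    print(simplified_lesk("bank", "they fished from the river bank near the water"))
```

    In this sketch the sense of a word is chosen purely by counting words shared between its context and each candidate gloss; a hybrid method such as the one described above would combine this kind of signal with a semantic similarity measure.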

Date

  • 2017-01-31

Type

  • Doctoral Thesis
  • PeerReviewed

Format

  • application/pdf

Identifier

  • urn:nbn:it:unibo-19839

  • Ayetiran, Eniafe Festus (2017) A Combined Unsupervised Technique for Automatic Classification in Electronic Discovery, [Dissertation thesis], Alma Mater Studiorum Università di Bologna. Dottorato di ricerca in Law, science and technology, 28 Ciclo. DOI 10.6092/unibo/amsdottorato/7789.

Relations