A large dataset of scientific text reuse in open-access publications

HIGHLIGHTS

who: Lukas Gienapp, from the Text Mining and Retrieval Group at Leipzig University, has published the research work "A large dataset of scientific text reuse in Open-Access publications" in the journal (JOURNAL).
what: To achieve high scalability, the authors implement the similarity computation by applying locality-sensitive hash functions h to each passage t in a document d, so that each document's passages are represented by a set of hash values h(t). Only six of the approaches presented at PAN run in sub-quadratic time, a necessary requirement for large-scale detection. This new contribution . . .
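
The hashing step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes MinHash over word 3-grams with banding as the locality-sensitive scheme, and the names `minhash`, `band_keys`, `candidate_pairs`, `NUM_HASHES`, and `BANDS` are hypothetical choices made for this example. Each passage t is mapped to a set of hash values h(t), and two documents become a candidate pair only when they share at least one value, so no all-pairs passage comparison is needed.

```python
import hashlib
from collections import defaultdict
from itertools import combinations

NUM_HASHES = 32  # assumed MinHash signature length (illustrative)
BANDS = 8        # assumed number of bands, i.e. 4 signature rows per band


def shingles(text, n=3):
    """Return the set of word n-grams of a passage."""
    words = text.lower().split()
    return {" ".join(words[i:i + n]) for i in range(max(1, len(words) - n + 1))}


def _hash(seed, token):
    """Deterministic 64-bit hash of a token under a given seed."""
    data = f"{seed}:{token}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash(text):
    """MinHash signature: the minimum hash of the shingle set per seed."""
    grams = shingles(text)
    return tuple(min(_hash(seed, g) for g in grams) for seed in range(NUM_HASHES))


def band_keys(signature):
    """Split a signature into bands; each band acts as one hash value h(t)."""
    rows = NUM_HASHES // BANDS
    return {(b, signature[b * rows:(b + 1) * rows]) for b in range(BANDS)}


def candidate_pairs(documents):
    """documents maps a document id to its list of passages.

    Returns the document pairs that share at least one band key.
    """
    buckets = defaultdict(set)
    for doc_id, passages in documents.items():
        for passage in passages:
            for key in band_keys(minhash(passage)):
                buckets[key].add(doc_id)
    pairs = set()
    for ids in buckets.values():
        # Pairs are enumerated only inside a bucket, never across the corpus.
        pairs.update(combinations(sorted(ids), 2))
    return pairs


docs = {
    "a": ["locality sensitive hashing makes large scale text reuse detection feasible"],
    "b": ["locality sensitive hashing makes large scale text reuse detection feasible",
          "an unrelated closing remark about evaluation"],
    "c": ["completely different wording appears in this short passage here"],
}
pairs = candidate_pairs(docs)
```

In this sketch, documents `a` and `b` share an identical passage and therefore land in the same buckets, while `c` has no shingle in common with either. Candidate pairs would still need a detailed alignment step to confirm actual reuse; the cost of candidate generation depends on bucket sizes rather than on the square of the corpus size.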
