Extractive text summarization

TextRank's sentence similarity function

You are given two sentences, each as a list of post-processed words. Compute their similarity score according to the original TextRank sentence similarity formula:

$\text{Sim}(S_i, S_j) = \frac{|\{ w_k \mid w_k \in S_i \ \&\ w_k \in S_j \}|}{\log(|S_i|) + \log(|S_j|)}$

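Here $\log$ is the natural logarithm. The numerator counts each word shared by both sentences once, and $|S_i|$ is the number of words in sentence $S_i$. Each input sentence is guaranteed to contain more than $2$ words. Return the value rounded to four digits after the decimal point.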

Sample Input 1:

['believ', 'opsydia', 'technolog', 'improv', 'exist', 'way', 'diamond', 'identifi']
['alreadi', 'system', 'track', 'diamond', 'first', 'emerg', 'mine', 'rough', 'diamond', 'cut', 'polish', 'phase', 'retail']

Sample Output 1:

0.2153
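
As a check on the sample, the only word the two sentences share is 'diamond', which counts once even though it appears twice in the second sentence. The first list has 8 words and the second has 13 (duplicates included), so:

$\text{Sim}(S_1, S_2) = \frac{1}{\log(8) + \log(13)} \approx \frac{1}{2.0794 + 2.5649} \approx 0.2153$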

Write a program in Python 3

def textrank_similarity(s1: list, s2: list) -> float:
    ...
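
One possible way to fill in the stub is sketched below. It is a minimal sketch rather than a reference solution, and it rests on two assumptions that agree with the sample: shared words are counted once each (a set intersection), and the sentence lengths in the denominator are the full list lengths, duplicates included. The variable names `shared` and `denominator` are illustrative.

```python
import math


def textrank_similarity(s1: list, s2: list) -> float:
    # Numerator: number of distinct words that occur in both sentences.
    shared = len(set(s1) & set(s2))
    # Denominator: natural logs of the full word counts (duplicates included).
    denominator = math.log(len(s1)) + math.log(len(s2))
    return round(shared / denominator, 4)


s1 = ['believ', 'opsydia', 'technolog', 'improv', 'exist', 'way',
      'diamond', 'identifi']
s2 = ['alreadi', 'system', 'track', 'diamond', 'first', 'emerg', 'mine',
      'rough', 'diamond', 'cut', 'polish', 'phase', 'retail']

print(textrank_similarity(s1, s2))  # 0.2153
```

Because both inputs are guaranteed to have more than two words, each logarithm is positive and the denominator cannot be zero, so no extra guard is needed here.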
