TextRank's sentence similarity function

You are given two sentences, each as a list of post-processed words. Compute their similarity score using the original TextRank sentence similarity formula:

$$\text{Sim}(S_i, S_j) = \frac{|\{ w_k \mid w_k \in S_i \ \&\ w_k \in S_j \}|}{\log(|S_i|) + \log(|S_j|)}$$

The natural logarithm is assumed for $\log$. Each input list is guaranteed to contain more than 2 words. Return the value rounded to four digits after the decimal point.
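
As a minimal sketch of how the formula might translate to Python (not an official solution): the function name `similarity_sketch` is hypothetical, and the code assumes that the numerator counts distinct shared words while each sentence length counts every token, duplicates included. That reading is an assumption, chosen because it reproduces the sample output given in this task.

```python
import math


def similarity_sketch(s1: list, s2: list) -> float:
    # Numerator: number of distinct words that occur in both sentences.
    shared = set(s1) & set(s2)
    # Denominator: natural logs of the sentence lengths. Lengths count
    # every token, duplicates included (assumption that matches the sample).
    denominator = math.log(len(s1)) + math.log(len(s2))
    # Round to four digits after the decimal point.
    return round(len(shared) / denominator, 4)
```

Because both lists are guaranteed to hold more than 2 words, each logarithm is positive and the denominator cannot be zero.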

Sample Input 1:

['believ', 'opsydia', 'technolog', 'improv', 'exist', 'way', 'diamond', 'identifi']
['alreadi', 'system', 'track', 'diamond', 'first', 'emerg', 'mine', 'rough', 'diamond', 'cut', 'polish', 'phase', 'retail']

Sample Output 1:

0.2153
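
The sample can be checked by hand, assuming sentence lengths count every token. The only word the two lists share is 'diamond', so the numerator is 1. The first list has 8 words and the second has 13 (the repeated 'diamond' counts twice toward the length):

$$\text{Sim}(S_1, S_2) = \frac{1}{\log(8) + \log(13)} \approx \frac{1}{2.0794 + 2.5649} \approx 0.2153$$
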
Write a program in Python 3
def textrank_similarity(s1: list, s2: list) -> float:
    ...