This has little effect on the findability of the document, because for that it doesn't matter much whether the same text is indexed more than once.
It does, however, have a severe effect on the scoring.
See e.g. http://www.lucenetutorial.com/advanced-topics/scoring.html
It would be better to check whether the values of certain fields were already indexed for this document, and to skip them if so.
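The check could look roughly like the sketch below. This is plain Java for illustration only, not code from the MMBase Lucene module: the class `SeenValues` and its method `firstTime` are hypothetical names, and a real fix would have to be placed wherever the indexer adds field values to the Lucene document.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/**
 * Hypothetical helper (not part of MMBase): remembers which values were
 * already added per field for the document that is currently being built.
 */
public class SeenValues {
    // field name -> values already added for that field
    private final Map<String, Set<String>> seen = new HashMap<>();

    /**
     * Returns true the first time a field/value pair is offered and false
     * on every later occurrence, so the caller can skip the duplicate.
     */
    public boolean firstTime(String field, String value) {
        return seen.computeIfAbsent(field, f -> new HashSet<>()).add(value);
    }
}
```

One instance would be created per document. The indexer would then add a value only when `firstTime(field, value)` returns true, so an 'episodes' value reached through many related mediafragments ends up in the document once.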
The index definition that was in use when this problem was found (from the EO repository):
<list path="genres,programs,episodes,mediarel,mediafragments,mediasources" element="episodes" searchdirs="destination">
<mmsq:constraint field="mediafragments.status" value="3" />
<mmsq:constraint field="mediasources.format" operator="in" value="1,9,12" />
<mmsq:constraint field="episodes.title" operator="like" value="%job%" />
<mmsq:constraint field="episodes.body" operator="like" value="%job%" />
<mmsq:field name="genres.number" alias="genre" keyword="true" />
<mmsq:field name="programs.title" boost="5" />
<mmsq:field name="programs.mediaclasse" alias="mediaclasse" keyword="true" />
<mmsq:field name="episodes.title" boost="10" />
<mmsq:field name="episodes.subtitle" />
<mmsq:field name="episodes.intro" />
<mmsq:field name="episodes.body" />
<mmsq:field name="episodes.shorttext" />
<mmsq:field name="episodes.keywords" alias="keywords" boost="20" keyword="true" split="," />
</list>
Here the multilevel path is made longer just to be able to add constraints on it. But if an episode has many related mediafragments, the 'episodes' fields themselves are added to the document many times, which makes it score very badly because the document becomes so big.
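A rough illustration of the size effect, assuming Lucene's classic TF-IDF similarity, in which term frequency contributes sqrt(freq) and the length norm of a field is 1/sqrt(number of terms). Whether this index uses that similarity, and how the values end up distributed over Lucene fields, are assumptions. The numbers are made up, and `LengthNormSketch` is a hypothetical name.

```java
public class LengthNormSketch {
    /**
     * tf * lengthNorm as in Lucene's classic similarity.
     * idf, boosts and query normalization are left out on purpose.
     */
    public static double weight(int termFreq, int fieldLength) {
        return Math.sqrt(termFreq) / Math.sqrt(fieldLength);
    }

    public static void main(String[] args) {
        // A term that occurs once in 100 terms of indexed text.
        double clean = weight(1, 100);
        // The same single occurrence after 50 extra copies of 100 terms
        // of other text were appended to the same indexed text.
        double bloated = weight(1, 100 + 50 * 100);
        System.out.println("clean=" + clean + " bloated=" + bloated);
    }
}
```

The penalty applies to terms whose own frequency does not grow along with the duplicated text. If the matching term is itself repeated in every copy, sqrt(freq) and the length norm largely cancel out under this model.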
The problem occurs if you do something like this:
<list path="t_stream,t_metadata" searchdirs="destination" element="t_stream">
<mmsq:constraint field="t_metadata.enduser" value="learner" />
<mmsq:constraint field="t_metadata.schooltype" value="1" />
<mmsq:field name="title" boost="2" />
<mmsq:field name="subtitle" />
<mmsq:field name="intro" />
<mmsq:field name="body" />
<mmsq:relatednodes type="t_metadata">
<mmsq:field name="enduser" alias="enduser" store="true" keyword="false" />
<mmsq:field name="minfactor" alias="minfactor" store="true" />
<mmsq:field name="maxfactor" alias="maxfactor" store="true" />
<mmsq:relatednodes type="t_keyword">
<mmsq:field name="name" boost="4" />
</mmsq:relatednodes>
</mmsq:relatednodes>
</list>
In this case, enduser, minfactor, and maxfactor are not indexed, because Lucene presumes that t_metadata has already been indexed.
It is possible to work around this by rewriting your queries like this:
<list path="t_stream,t_metadata" searchdirs="destination" element="t_stream">
<mmsq:constraint field="t_metadata.enduser" value="learner" />
<mmsq:constraint field="t_metadata.schooltype" value="1" />
<mmsq:field name="title" boost="2" />
<mmsq:field name="subtitle" />
<mmsq:field name="intro" />
<mmsq:field name="body" />
<mmsq:field name="t_metadata.enduser" alias="enduser" store="true" keyword="false" />
<mmsq:field name="t_metadata.minfactor" alias="minfactor" store="true" />
<mmsq:field name="t_metadata.maxfactor" alias="maxfactor" store="true" />
<mmsq:relatednodes type="t_metadata">
<mmsq:relatednodes type="t_keyword">
<mmsq:field name="name" boost="4" />
<mmsq:field name="synonyms" />
<mmsq:field name="typos" />
</mmsq:relatednodes>
</mmsq:relatednodes>
</list>
But that is a bit silly.
At any rate, this is not backward compatible, so I would prefer this to be rolled back (at least in 1.9) until a proper fix is made that does not cause these issues.