Well, the answer to that question really depends on how one defines duplicate.
Some people define duplicate content as a block of content that search engines have already indexed on websites other than your own.
Others define it exclusively as blocks of content that search engines find on your site via various pathways.
The first kind of duplicate content, a block of text that can be found on both your site and another site, does not harm your site in any way, contrary to what some claim. Provided it is not stolen content, the two websites simply compete for SERP rank. No big deal.
The other type of duplicate content may, in rare instances, be misconstrued by search bots as black hat SEO, in which the webmaster is attempting to fool them. This type is bad for your site.
Probably the simplest example of what some search engines may consider duplicate content on your own website is the pair of separate pathways that lead to an article or blog post: the title above an excerpt on your index page and the “Read More” link beneath that excerpt.
Both pathways lead to the same content. The same can be said of pathways created by categories and/or tags.
In these cases, it is generally a good idea to add rel="canonical" to every pathway to your content other than the one you want search engines to identify as the preferred path, with each one pointing at that preferred path.
This is simply one way. Google supports a variety of methods for carrying out this process, which is commonly referred to as canonicalization.
This will help prevent any (albeit unlikely) bot confusion on this subject and will help keep your site healthy, indexed, and well positioned in the SERPs.
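As a rough illustration of the rel="canonical" approach described above, here is a minimal Python sketch that builds the link element each alternate pathway would carry in its page head. The function name and the example.com URLs are made up for illustration; how the tag actually gets into your pages depends on your CMS or theme.

```python
from html import escape


def canonical_link_tag(preferred_url: str) -> str:
    """Build the <link> element that goes in the <head> of each alternate page.

    preferred_url is the one address you want search engines to treat
    as the preferred path to the content.
    """
    # Escape the URL so characters such as & and " stay valid inside the attribute.
    return f'<link rel="canonical" href="{escape(preferred_url, quote=True)}">'


# Hypothetical URLs: one preferred path and two alternate pathways to the same post.
PREFERRED = "https://example.com/blog/my-post/"
ALTERNATES = [
    "https://example.com/category/seo/my-post/",
    "https://example.com/tag/duplicate-content/my-post/",
]

if __name__ == "__main__":
    for url in ALTERNATES:
        # Every alternate pathway gets the same tag, pointing at the preferred path.
        print(url, "->", canonical_link_tag(PREFERRED))
```

Every alternate page carries an identical tag, so a search bot arriving by any pathway is pointed at the same preferred address.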
Duplicate content shared between sites is a bit of a different scenario.
Generally, sites containing content that was already published on another site and is already in the Google index are not penalized, per se. Rather, they simply do not rank in the SERPs.
Checking whether a block of content has already been indexed is quite easy: copy and paste blocks of text from any article into the Google search bar and view the results.
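If you want to run this check for many articles, the copy-and-paste step can be sketched in Python. The snippet below only builds a search URL to open in a browser; it does not fetch or parse results. Wrapping the text in quotation marks to ask for an exact phrase, the function name, and the 12-word cutoff are all assumptions made for this example.

```python
from urllib.parse import urlencode


def exact_match_search_url(text: str, max_words: int = 12) -> str:
    """Build a Google search URL for the first few words of a text block.

    max_words is an arbitrary cutoff chosen for this sketch; a long
    paragraph pasted whole makes for an unwieldy query.
    """
    # Drop straight double quotes so they cannot break the quoted phrase.
    words = text.replace('"', " ").split()
    snippet = " ".join(words[:max_words])
    # Quoting the snippet asks for the phrase as written (an assumption of this sketch).
    return "https://www.google.com/search?" + urlencode({"q": f'"{snippet}"'})


if __name__ == "__main__":
    sample = "Well, the answer to that question really depends on how one defines duplicate."
    print(exact_match_search_url(sample))
```

Opening the printed URL in a browser shows which pages, if any, already contain that passage.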
Google has essentially stated that if you syndicate your content, i.e. allow your content to be published elsewhere, your original copy may or may not be the one shown in search results…
For these reasons, an author may decide to place or request the placement of a “Source:” link at the bottom of syndicated copies.
As a general rule, the first location at which search bots detect a unique block of text has an advantage in terms of SERP placement. This does not always remain the case, but usually the original will rank the highest.