Why does SharePoint insert character code 8203 into a rich text field?
I use some RichHtmlField controls (PublishingWebControls) in different page layouts. I edit the pages, put some text in the fields and publish. It all seems to work fine, but I've noticed that SharePoint saves an extra character to my string. Usually it's added at the beginning of the string, but sometimes at the end.

You cannot see it using the ordinary browser window, because it's a zero width space character. But if you right click and select View source, it's visible as a big space. 

It's possible to copy the text from the view source window to a text editor while preserving this character, so I pasted it as a string into a C# program. When I loop through every character in the string to check its character code, this particular character shows as 8203.

I used CAML Builder to check what my string looked like in the database. I couldn't see anything strange, but when I copied the string from the CAML Builder result tab and pasted it into a hex editor, I could clearly see that the strange character was there.

The problem is that we translate our pages to different languages and this character makes the translation engine go bananas.

Has anyone experienced this before or has any idea how this could be solved?
June 10th, 2013 1:23pm

Hi,

I am trying to involve someone familiar with this topic to further look at this issue.

Thanks

June 12th, 2013 9:47am

Thank you.

Please let me know if you need more information or the HTML being rendered

June 12th, 2013 9:18pm

Try this free rich text editor:

http://ckeditor.com/blog/CKEditor-for-SharePoint-Ultimate-Editing-Solution

It might solve your problem :)

Second, are you copying content into the RichHtmlField from somewhere else, such as the internet or Word?

June 13th, 2013 3:58am

Please share the piece of HTML that contains the Unicode characters.
June 13th, 2013 3:59am

The problem appears in all my richtext fields. It doesn't matter if I copy/paste text into these fields or if I write a simple text manually in the fields.

I tried to put a string containing the character in a code block above, but the character was lost during this process. This is what happens when I paste the string directly in this editor:

<

p>This is my lead text for my news item.</p>


When I check the character codes for this string (as it appears in my HTML page), I get this:

60        (which is <)
112      (which is p)
62        (which is >)
8203    (here it is)
84     (which is T) and so on...
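
The same check can be sketched in JavaScript rather than C#. This is only an illustration, not the program used above: the helper name `charCodes` and the sample string are made up for the example. Note that 8203 is a Unicode code point (U+200B), which lies outside the 0-127 ASCII range.

```javascript
// Hypothetical helper: list the numeric code of every character in a string.
function charCodes(text) {
  // Array.from iterates the string by code point, one entry per character.
  return Array.from(text, function (ch) {
    return ch.codePointAt(0);
  });
}

// Sample string with a zero width space (U+200B) right after the <p> tag.
var sample = '<p>\u200BThis is my lead text for my news item.</p>';
var codes = charCodes(sample);

// First five codes: 60, 112, 62, 8203, 84
console.log(codes.slice(0, 5).join(', '));
```

Printing the codes this way makes the invisible character easy to spot, because any value of 8203 in the list marks a zero width space.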


June 13th, 2013 8:40am


Hello,

Code 8203 is the Unicode character U+200B, ZERO WIDTH SPACE (it is outside the ASCII range): http://www.fileformat.info/info/unicode/char/200b/index.htm

It is commonly abbreviated ZWSP. This character is intended for invisible word separation and for line break control; it has no width, but its presence between two characters does not prevent increased letter spacing in justification.

A third-party editor may be adding them if they are not originally present in your script/code.

Are you using any such editor tools?

It will be helpful to open a troubleshooting ticket with Microsoft so that we can look at the issue in more depth.

July 25th, 2013 5:43pm

I am running into this same problem. In our case, we have a search results web part inside a rich HTML zone. The web part displays pictures from a picture library. For whatever reason, SharePoint is adding 21 of these &#8203; Unicode characters to the very top of the rich HTML zone (before the web part). Aside from the web part, the rich HTML zone does not have any content, so I don't think they are coming from a third-party text editor.

This wouldn't be an issue except that these characters create an ugly and unwanted margin above the web part.

October 3rd, 2013 10:49pm

Nice to see I'm not the only one with this problem. I didn't get a solution to this. When I'm using my custom webparts, I can always filter in code behind by doing this:

description = description.Replace(((char)8203).ToString(), "");

Unfortunately this isn't possible when you're putting the rich text fields directly on your page layout.
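
For cases where the text does pass through your own code (for example, before it is handed to a translation engine), the same idea can be sketched in JavaScript. This is a minimal sketch under assumptions: the helper name `stripZeroWidthSpaces` and the sample string are hypothetical. Stripping the numeric entity forms as well is an addition here, prompted by the `&#8203;` form reported earlier in the thread.

```javascript
// Hypothetical helper: remove zero width spaces from an HTML string.
// Handles the literal character (U+200B) and its numeric entity forms
// (&#8203; and &#x200B;); the entity match is case-insensitive.
function stripZeroWidthSpaces(html) {
  return html
    .replace(/\u200B/g, '')
    .replace(/&#(?:8203|x200b);/gi, '');
}

// Sample input with the character in entity form and in literal form.
var dirty = '&#8203;<p>\u200BThis is my lead text.</p>\u200B';

// Prints: <p>This is my lead text.</p>
console.log(stripZeroWidthSpaces(dirty));
```

This works on strings only, so it does not help with fields that SharePoint renders directly on the page layout without going through your code.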

October 4th, 2013 8:05am

We tried running some JavaScript on the master page to get rid of them. It worked, but it ended up breaking a jQuery plugin we're using in a custom display template. It might work for you, though.

The plan now is to stop using rich HTML zones (or at least avoid putting web parts in them where we can) and just use web part zones instead.

October 4th, 2013 2:29pm

FWIW, I know this is months later, but I just ran into the same problem. I found that joining my code onto a single line before inserting it via the editor prevented the zero width spaces (8203s). I run a regex find and replace in SublimeText2 that replaces \n with nothing.
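
The line-joining step described above can be sketched in JavaScript instead of an editor find and replace. This is only an illustration: the helper name `joinLines` and the sample markup are made up, and removing line breaks would also change content inside elements such as `<pre>`, where line breaks are significant.

```javascript
// Hypothetical helper: join HTML source onto a single line before pasting it.
function joinLines(html) {
  // Remove Windows (\r\n), Unix (\n) and lone carriage return (\r) line breaks.
  return html.replace(/\r\n|\n|\r/g, '');
}

var source = '<div>\r\n<img src="a.png">\n<img src="b.png">\n</div>';

// Prints: <div><img src="a.png"><img src="b.png"></div>
console.log(joinLines(source));
```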
February 20th, 2014 10:55pm

I wrote this little gem which is a bit DOM-intensive but it does the job...

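function spCleanup(code){
    if(code.children().length > 0){
        // Rebuild the markup from the child elements only (text directly inside this element is dropped),
        // remove &nbsp;, line breaks and tabs, then collapse runs of whitespace into one space
        code.html(jQuery('<div>').append(code.children().clone()).html().replace(/&nbsp;|&#160;|\r\n|\n|\r|\t/g,'').replace(/\s{2,}/g,' '));
        // Repeat for every child element
        code.children().each(function(){
            spCleanup(jQuery(this));
        });
    }
}

spCleanup(jQuery('#s4-bodyContainer'));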

March 12th, 2014 2:49pm

Actually, I just simplified my fix down to this...

jQuery('#s4-bodyContainer').html(jQuery('#s4-bodyContainer').html().replace(/\u200B/g,''));

March 12th, 2014 5:25pm

I had the same issue and resolved it with a little trick using my browser's developer tools:

  1. Go to edit page
  2. Turn on "Developer tools"
  3. Inspect the unwanted characters
  4. Right click > delete node/element

Save your page and the changes will be saved as well ;)




March 28th, 2014 9:36am


webninjataylor, I can't thank you enough for this jQuery solution you put together. Thank you!!! This answer came at the right time!
July 21st, 2014 11:59am

You're welcome.  :)

BTW, I recently found out the script solution doesn't play well with some Bootstrap scenarios.  For me, I've seen parts of the DOM removed.  :(
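
One possible explanation for this kind of breakage (an assumption; nothing in this thread confirms it) is that `.html(newString)` throws away the existing child elements and rebuilds them from the string, so event handlers and plugin state attached to the old elements are lost. A gentler alternative is to edit only the text nodes and leave the elements in place. The sketch below uses only standard DOM properties (`nodeType`, `nodeValue`, `childNodes`); the function name `stripZwspFromTextNodes` and the example id are hypothetical.

```javascript
// Hypothetical helper: strip zero width spaces from text nodes only,
// leaving the elements themselves (and anything bound to them) untouched.
function stripZwspFromTextNodes(node) {
  var TEXT_NODE = 3; // same value as Node.TEXT_NODE in browsers
  if (node.nodeType === TEXT_NODE) {
    // Only write back when something actually changes.
    if (node.nodeValue.indexOf('\u200B') !== -1) {
      node.nodeValue = node.nodeValue.replace(/\u200B/g, '');
    }
    return;
  }
  // Recurse into child nodes of elements and other container nodes.
  var children = node.childNodes || [];
  for (var i = 0; i < children.length; i++) {
    stripZwspFromTextNodes(children[i]);
  }
}

// In a browser this could be called like this (the id is a made-up example):
//   stripZwspFromTextNodes(document.getElementById('my-content-area'));
```

Because no element is removed or recreated, this approach should be less likely to interfere with scripts that hold references to those elements, though it has not been tested against SharePoint or Bootstrap here.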

July 21st, 2014 1:53pm

I'm discovering right now that you just have to be specific about where you want the cleanup to occur. If you wrap your custom content areas (whether it's a reusable content item or a content section on your page), use that wrapper's id selector instead of #s4-bodyContainer to remove the characters. I don't think SharePoint plays nice when we try to remove these characters from its parent-level id selectors...

 
  • Edited by blackhawx Monday, July 21, 2014 2:29 PM
July 21st, 2014 2:29pm


This is helping me out so far with the Oslo template. This way I can simply list all the ID selectors I really care about... and clean them up!

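function removecharacters() {
  /* REMOVE THE 8203 CHARACTER FROM DOM */
  var obj = {
    "resuable1": "features",
    "resuable2": "slider-support"
  };
  $.each(obj, function(key, val) {
    var target = $('#' + val);
    /* Skip ids that are not on the current page; .html() would return undefined */
    if (target.length) {
      target.html(target.html().replace(/\u200B/g, ''));
    }
  });
}

/* CALL ALL JQUERY FUNCTIONS */
/* .on('load', ...) also works in jQuery 3+, where the .load() event shorthand was removed */
$(window).on('load', function() {
  removecharacters();
});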





  • Edited by blackhawx Monday, July 21, 2014 2:46 PM
July 21st, 2014 2:42pm


The developer tools trick above (inspect the unwanted characters in edit mode and delete the node) is a good plan too, thanks for sharing this.
July 21st, 2014 3:02pm

Excellent, blackhawx.  I'll give that a shot when I need it next.  :)
July 21st, 2014 3:05pm

Cool.  Thanks!  :)
July 21st, 2014 3:08pm

Nice and simple solution. Thanks.
April 3rd, 2015 10:45am

All of the solutions proposed here are wonderful, but the elephant is still in the room. Why does the RichHtmlField behave like this? Please don't tell me that this behavior is intended? I'm supposed to tell my site content editors that they can't create clean HTML inside a RichHtmlField in source code view and count on it still being there after the file is checked in?? That's ridiculous. 

In source code view, I created a simple DIV containing a series of stacked images, and went so far as to remove ALL line breaks and spaces between everything - all the HTML was on the same line. Upon check in, SharePoint inserts these characters no matter what.

It's great that we have both server and client side work-arounds, but seriously, shouldn't it simply work properly in the first place? There are dozens of FREE WYSIWYG editors that work beautifully. But the (expensive) out of the box SharePoint RichHtmlField doesn't. WHY?

June 30th, 2015 7:43pm

This topic is archived. No further replies will be accepted.
