ASP.Net BBCode (C#)

Update

This code has now been superseded by a better alternative that lets you use an off-the-shelf WYSIWYG editor and still allow custom tags.

This problem comes up if you find yourself creating a forum from scratch or implementing some sort of comment system and want to make sure you can introduce some formatting functionality without compromising security.

Well, there are plenty of regular expression examples out there, but few deal directly with BBCode, and of those, most don’t go beyond the basic Bold, Italic, and Strike formatting plus HTML links, images etc…

This example not only formats the basic stuff above, but also handles quotes, alignment, Google search links, Wikipedia article links, as well as embedded videos for several video sharing sites. You can always add more tags by following the same pattern. All the content is formatted into paragraphs (<p>) for proper validation. Nested quotes are converted up to a specified depth.

There is no extensive input cleanup to prevent XSS attacks through tags. I’m just showing the basics like tag stripping, which can be circumvented by clever attackers, so it’s up to you to implement a more thorough system. I’m excluding it because there are already many, many, many examples out there that do a wonderful job of it.

It does restrict users to a limited set of formatting options.
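As one possible starting point for that more thorough cleanup, here is a minimal sketch. It is not code from the original project: the class and method names are made up for illustration. It HTML-encodes the raw input instead of stripping tags, and only accepts absolute http/https URLs inside [url]. It assumes .NET 4.0 or later for `WebUtility`; on older frameworks `HttpUtility.HtmlEncode` from System.Web plays the same role.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original class.
public static class BbCodeSafety
{
    // Encode everything first so user-supplied <, >, & and quotes can never become markup.
    public static string Encode(string raw)
    {
        return WebUtility.HtmlEncode(raw ?? string.Empty);
    }

    // Accept only absolute http/https URLs; this rejects javascript:, data: and relative paths.
    public static bool IsSafeUrl(string url)
    {
        Uri uri;
        return Uri.TryCreate(url, UriKind.Absolute, out uri)
            && (uri.Scheme == Uri.UriSchemeHttp || uri.Scheme == Uri.UriSchemeHttps);
    }

    // Convert [url]...[/url] on already-encoded text, leaving unsafe URLs as plain text.
    public static string ConvertLinks(string encoded)
    {
        return Regex.Replace(encoded, @"\[url\]([^\]\[]+)\[/url\]", m =>
        {
            string url = m.Groups[1].Value;
            return IsSafeUrl(WebUtility.HtmlDecode(url))
                ? "<a href=\"" + url + "\">" + url + "</a>"
                : url;
        }, RegexOptions.IgnoreCase);
    }
}
```

The idea is to call `Encode` once on the raw text and run the tag conversions on the encoded result, so anything that is not an explicitly recognised tag stays inert text.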

To make everything easier to read in the code file, I used multiple Regex replacements instead of one super duper pattern. This also makes adding quick tags for something else much simpler.
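To illustrate the "one replacement per tag" idea, here is a small table-driven sketch. The `SimpleTags` class and the `[sup]` tag are hypothetical and only show the shape of a rule; the real function further down simply calls MatchReplace once per tag.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical illustration of the one-pattern-per-tag approach.
public static class SimpleTags
{
    // Each entry is one independent replacement: pattern first, replacement second.
    private static readonly KeyValuePair<string, string>[] Rules =
    {
        new KeyValuePair<string, string>(@"\[b\]([^\]]+)\[/b\]", "<strong>$1</strong>"),
        new KeyValuePair<string, string>(@"\[i\]([^\]]+)\[/i\]", "<em>$1</em>"),
        // An extra tag, added by copying the same shape.
        new KeyValuePair<string, string>(@"\[sup\]([^\]]+)\[/sup\]", "<sup>$1</sup>"),
    };

    // Runs every rule in order over the content.
    public static string Apply(string content)
    {
        foreach (KeyValuePair<string, string> rule in Rules)
            content = Regex.Replace(content, rule.Key, rule.Value, RegexOptions.IgnoreCase);
        return content;
    }
}
```

Because `[^\]]+` stops at the first `]`, a rule only fires once everything inside the tag has already been converted, so the order of the rules matters for nested tags.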

A sample of the rendered markup :

  • [b]Bold[/b] = Bold
  • [i]Italic[/i] = Italic
  • [del]Strike[/del] = Strike
  • [color=blue]Blue[/color] = Blue
  • [color=#FA9A99]Pinkish[/color] = Pinkish
  • [size=2]Larger text[/size] (between 2 & 5) = Larger text
  • [google]once in a blue moon[/google] = once in a blue moon
  • [wikipedia]Captain Haddock[/wikipedia] = Captain Haddock (Remember, Wikipedia articles are case sensitive)
  • [img]http://www.google.com/logos/Logo_40blk.gif[/img] =
  • [img=Google Logo]http://www.google.com/logos/Logo_40blk.gif[/img] = Google Logo

One major difference from other BBCode functions is the ability to embed YouTube, Metacafe and LiveVideo media players. You only need to specify the URL within the tags.

E.g. for YouTube:


For LiveVideo:

I don’t think WordPress supports embedding Metacafe videos at the time of this post, but if you want to include those videos :
[metacafe]http://www.metacafe.com/watch/685732/the_diet/[/metacafe]

To find the exact URL of the LiveVideo page, click on the “Get Codes” link underneath the video player.
I tried to keep this as simple as possible for users: they just need to wrap the URL in [tag][/tag] markers.
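For a feel of how a wrapped URL can be turned into an embed, the sketch below pulls the video ID out of a [youtube] tag. The `VideoTags` class is hypothetical and the pattern is a slightly tightened variant of the idea, so treat it as an illustration rather than the exact expression used in the function further down.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only.
public static class VideoTags
{
    // The subdomain (www. and so on) is optional; the dots are escaped so they match literally.
    private static readonly Regex YouTube = new Regex(
        @"\[youtube\]http://(?:[a-z]+\.)?youtube\.com/watch\?v=([a-z0-9_\-]+)\[/youtube\]",
        RegexOptions.IgnoreCase);

    // Returns the video ID, or null when the tag content is not a recognisable watch URL.
    public static string GetYouTubeId(string tag)
    {
        Match m = YouTube.Match(tag);
        return m.Success ? m.Groups[1].Value : null;
    }
}
```

Only the captured ID ends up in the generated player markup, so a URL pointing at any other host simply fails to match and is left as plain text.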

You can embed quotes following a convention similar to phpBB’s, with slight differences since this was written for a custom application.

[quote]This is a quote[/quote]

This is a quote

[quote=Author]This is a quote[/quote]

Author wrote


This is a quote

And so on…
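Nested quotes can be handled by running the same replacement several times, converting the innermost quote on each pass. The following is a simplified sketch of that idea; `QuoteDemo` and the bare `<blockquote>` output are stand-ins for illustration, not the markup the real function emits.

```csharp
using System.Text.RegularExpressions;

// Hypothetical, simplified illustration of depth-limited quote nesting.
public static class QuoteDemo
{
    public static string ConvertQuotes(string content, int maxDepth)
    {
        for (int i = 0; i < maxDepth; i++)
        {
            // [^\[\]]+ cannot cross another tag, so only the innermost unconverted quote matches.
            content = Regex.Replace(content, @"\[quote\]([^\[\]]+)\[/quote\]",
                "<blockquote>$1</blockquote>", RegexOptions.IgnoreCase);
        }
        return content;
    }
}
```

Each pass peels off one level, so quotes nested deeper than `maxDepth` are left as raw [quote] text instead of being converted.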

This version also deals with headers
([h#]Header[/h#] becomes <h#>Header</h#>, where # is 1 to 6.)

[h3]Header[/h3]

[h4]Header[/h4]

And so on…
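The function further down uses one replacement per header level. As an alternative, and purely as a sketch (the `HeaderTags` class is hypothetical), a single pattern with a backreference can cover all six levels:

```csharp
using System.Text.RegularExpressions;

// Hypothetical alternative to six separate header replacements.
public static class HeaderTags
{
    // \1 forces the closing tag to carry the same digit as the opening tag.
    public static string Convert(string content)
    {
        return Regex.Replace(content, @"\[h([1-6])\]([^\]]+)\[/h\1\]",
            "<h$1>$2</h$1>", RegexOptions.IgnoreCase);
    }
}
```

A mismatched pair such as [h3]Header[/h4] is left untouched, which is arguably what you want.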

Of course, you would want to apply special formatting via CSS to keep the look of the page consistent with the rest of the site.

This particular excerpt was written for a .Net 3.5 app, but this portion should work on 2.0+ with no alterations since it doesn’t use anything unique to the newer framework.

This class is by no means meant to be a comprehensive BBCode plugin, but it should get you on your way to creating your own custom tags.

Once again, this code has no usage restrictions. I’m just including a disclaimer like all other code samples I’ve posted here. You don’t have to ask me permission to use it for any purpose and I only ask that you abide by the disclaimer.

THE SOFTWARE IS PROVIDED “AS IS” AND THE AUTHOR DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.

The tag function

/// <summary>
/// Converts the input plain-text BBCode to HTML output, replacing carriage returns
/// and spaces with &lt;br /&gt; and &amp;nbsp; etc...
/// Recommended: Use this function only during storage and updates.
/// Keep separate fields in your database for the HTML formatted content and the raw text.
/// An optional third, plain text field, with no formatting info will make full text searching
/// more accurate.
/// E.G. BodyText(with BBCode for textarea/WYSIWYG), BodyPlain(plain text for searching),
/// BodyHtml(formatted HTML for output pages)
/// </summary>
public static string ConvertToHtml(string content)
{
    // Clean your content here... E.G.:
    // content = CleanText(content);

    // Basic tag stripping for this example (PLEASE EXTEND THIS!)
    content = StripTags(content);

    content = MatchReplace(@"\[b\]([^\]]+)\[\/b\]", "<strong>$1</strong>", content);
    content = MatchReplace(@"\[i\]([^\]]+)\[\/i\]", "<em>$1</em>", content);
    content = MatchReplace(@"\[u\]([^\]]+)\[\/u\]", @"<span style=""text-decoration:underline"">$1</span>", content);
    content = MatchReplace(@"\[del\]([^\]]+)\[\/del\]", @"<span style=""text-decoration:line-through"">$1</span>", content);

    // Colors and sizes
    content = MatchReplace(@"\[color=(#[0-9a-fA-F]{6}|[a-z-]+)]([^\]]+)\[\/color\]", @"<span style=""color:$1;"">$2</span>", content);
    content = MatchReplace(@"\[size=([2-5])]([^\]]+)\[\/size\]", @"<span style=""font-size:$1em; font-weight:normal;"">$2</span>", content);

    // Text alignment
    content = MatchReplace(@"\[left\]([^\]]+)\[\/left\]", @"<span style=""text-align:left"">$1</span>", content);
    content = MatchReplace(@"\[right\]([^\]]+)\[\/right\]", @"<span style=""text-align:right"">$1</span>", content);
    content = MatchReplace(@"\[center\]([^\]]+)\[\/center\]", @"<span style=""text-align:center"">$1</span>", content);
    content = MatchReplace(@"\[justify\]([^\]]+)\[\/justify\]", @"<span style=""text-align:justify"">$1</span>", content);

    // HTML Links
    content = MatchReplace(@"\[url\]([^\]]+)\[\/url\]", @"<a href=""$1"">$1</a>", content);
    content = MatchReplace(@"\[url=([^\]]+)]([^\]]+)\[\/url\]", @"<a href=""$1"">$2</a>", content);

    // Images
    content = MatchReplace(@"\[img\]([^\]]+)\[\/img\]", @"<img src=""$1"" alt="""" />", content);
    content = MatchReplace(@"\[img=([^\]]+)]([^\]]+)\[\/img\]", @"<img src=""$2"" alt=""$1"" />", content);

    // Lists (the * must be escaped, otherwise it is read as a quantifier)
    content = MatchReplace(@"\[\*\]([^\[]+)", "<li>$1</li>", content);
    content = MatchReplace(@"\[list\]([^\]]+)\[\/list\]", "</p><ul>$1</ul><p>", content);
    content = MatchReplace(@"\[list=1\]([^\]]+)\[\/list\]", "</p><ol>$1</ol><p>", content);

    // Headers
    content = MatchReplace(@"\[h1\]([^\]]+)\[\/h1\]", "<h1>$1</h1>", content);
    content = MatchReplace(@"\[h2\]([^\]]+)\[\/h2\]", "<h2>$1</h2>", content);
    content = MatchReplace(@"\[h3\]([^\]]+)\[\/h3\]", "<h3>$1</h3>", content);
    content = MatchReplace(@"\[h4\]([^\]]+)\[\/h4\]", "<h4>$1</h4>", content);
    content = MatchReplace(@"\[h5\]([^\]]+)\[\/h5\]", "<h5>$1</h5>", content);
    content = MatchReplace(@"\[h6\]([^\]]+)\[\/h6\]", "<h6>$1</h6>", content);

    // Horizontal rule
    content = MatchReplace(@"\[hr\]", "<hr />", content);

    // Set a maximum quote depth (In this case, hard coded to 3).
    // Each pass converts the innermost remaining quotes, so three passes give three levels.
    for (int i = 0; i < 3; i++)
    {
        // Quotes (QuoteUrl is a small helper that turns a post ID into a link to that post)
        content = MatchReplace(@"\[quote=([^\]]+)@([^\]]+)\|([^\]]+)]([^\]]+)\[\/quote\]", @"</p><div class=""block""><blockquote><cite>$1 <a href=""" + QuoteUrl("$3") + @""">wrote</a> on $2</cite><hr /><p>$4</p></blockquote></div><p>", content);
        content = MatchReplace(@"\[quote=([^\]]+)@([^\]]+)]([^\]]+)\[\/quote\]", @"</p><div class=""block""><blockquote><cite>$1 wrote on $2</cite><hr /><p>$3</p></blockquote></div><p>", content);
        content = MatchReplace(@"\[quote=([^\]]+)]([^\]]+)\[\/quote\]", @"</p><div class=""block""><blockquote><cite>$1 wrote</cite><hr /><p>$2</p></blockquote></div><p>", content);
        content = MatchReplace(@"\[quote\]([^\]]+)\[\/quote\]", @"</p><div class=""block""><blockquote><p>$1</p></blockquote></div><p>", content);
    }

    // The following markup is for embedded video -->

    // YouTube
    content = MatchReplace(@"\[youtube\]http:\/\/([a-zA-Z]+\.)youtube\.com\/watch\?v=([a-zA-Z0-9_\-]+)\[\/youtube\]",
        @"<object width=""425"" height=""344""><param name=""movie"" value=""http://www.youtube.com/v/$2""></param><param name=""allowFullScreen"" value=""true""></param><embed src=""http://www.youtube.com/v/$2"" type=""application/x-shockwave-flash"" allowfullscreen=""true"" width=""425"" height=""344""></embed></object>", content);

    // LiveVideo
    content = MatchReplace(@"\[livevideo\]http:\/\/([a-zA-Z]+\.)livevideo\.com\/video\/([a-zA-Z0-9_\-]+)\/([a-zA-Z0-9]+)\/([a-zA-Z0-9_\-]+)\.aspx\[\/livevideo\]",
        @"<object width=""445"" height=""369""><embed src=""http://www.livevideo.com/flvplayer/embed/$3"" type=""application/x-shockwave-flash"" quality=""high"" width=""445"" height=""369"" wmode=""transparent""></embed></object>", content);

    // LiveVideo (There are two types of links for LV)
    content = MatchReplace(@"\[livevideo\]http:\/\/([a-zA-Z]+\.)livevideo\.com\/video\/([a-zA-Z0-9]+)\/([a-zA-Z0-9_\-]+)\.aspx\[\/livevideo\]",
        @"<object width=""445"" height=""369""><embed src=""http://www.livevideo.com/flvplayer/embed/$2&autostart=0"" type=""application/x-shockwave-flash"" quality=""high"" width=""445"" height=""369"" wmode=""transparent""></embed></object>", content);

    // Metacafe
    content = MatchReplace(@"\[metacafe\]http\:\/\/([a-zA-Z]+\.)metacafe\.com\/watch\/([0-9]+)\/([a-zA-Z0-9_]+)/\[\/metacafe\]",
        @"<object width=""400"" height=""345""><embed src=""http://www.metacafe.com/fplayer/$2/$3.swf"" width=""400"" height=""345"" wmode=""transparent"" pluginspage=""http://www.macromedia.com/go/getflashplayer"" type=""application/x-shockwave-flash""></embed></object>", content);

    // LiveLeak
    content = MatchReplace(@"\[liveleak\]http:\/\/([a-zA-Z]+\.)liveleak\.com\/view\?i=([a-zA-Z0-9_]+)\[\/liveleak\]",
        @"<object width=""450"" height=""370""><param name=""movie"" value=""http://www.liveleak.com/e/$2""></param><param name=""wmode"" value=""transparent""></param><embed src=""http://www.liveleak.com/e/$2"" type=""application/x-shockwave-flash"" wmode=""transparent"" width=""450"" height=""370""></embed></object>", content);

    // < -- End video markup

    // Google and Wikipedia page links
    content = MatchReplace(@"\[google\]([^\]]+)\[\/google\]", @"<a href=""http://www.google.com/search?q=$1"">$1</a>", content);
    content = MatchReplace(@"\[wikipedia\]([^\]]+)\[\/wikipedia\]", @"<a href=""http://www.wikipedia.org/wiki/$1"">$1</a>", content);

    // Put the content in a paragraph
    content = "<p>" + content + "</p>";

    // Clean up a few potential markup problems.
    // Assumed intent: tabs, double spaces and line breaks become &nbsp; and <br />,
    // and <p> wrappers left around block-level elements are removed.
    content = content.Replace("\t", "&nbsp;&nbsp;&nbsp;&nbsp;")
        .Replace("  ", "&nbsp; ")
        .Replace("\r\n", "<br />")
        .Replace("<p><br />", "<p>")
        .Replace("<p><blockquote>", "<blockquote>")
        .Replace("</blockquote></p>", "</blockquote>")
        .Replace("<p></p>", "")
        .Replace("<p><ul>", "<ul>")
        .Replace("</ul></p>", "</ul>")
        .Replace("<p><ol>", "<ol>")
        .Replace("</ol></p>", "</ol>")
        .Replace("<p><li>", "<li>")
        .Replace("</li></p>", "</li>");

    return content;
}
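
The summary comment above recommends keeping a plain text copy for searching. One way to produce that copy is to drop anything that looks like a BBCode tag and keep the text in between. This is only a sketch; `PlainText.StripBbCode` is a hypothetical helper that was not part of the original class.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper for building the plain text (search) copy of a post.
public static class PlainText
{
    // Removes anything shaped like [tag], [/tag], [tag=value] or [*]; the text between tags is kept.
    public static string StripBbCode(string content)
    {
        return Regex.Replace(content, @"\[/?[a-z][a-z0-9]*(=[^\]]*)?\]|\[\*\]", "",
            RegexOptions.IgnoreCase);
    }
}
```

Note that this also removes ordinary bracketed words such as [sic], so a stricter version would check the tag name against the list of supported tags.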

StripTags and MatchReplace functions

/// <summary>
/// Strip any existing HTML tags
/// </summary>
/// <param name="content">Raw input from user</param>
/// <returns>Tag stripped storage safe text</returns>
public static string StripTags(string content)
{
	return MatchReplace(@"<[^>]+>", "", content, true, true, true);
}

public static string MatchReplace(string pattern, string match, string content)
{
	return MatchReplace(pattern, match, content, false, false, false);
}

public static string MatchReplace(string pattern, string match, string content, bool multi)
{
	return MatchReplace(pattern, match, content, multi, false, false);
}

public static string MatchReplace(string pattern, string match, string content, bool multi, bool white)
{
	// Pass the sixth argument explicitly so this overload does not call itself
	return MatchReplace(pattern, match, content, multi, white, false);
}

/// <summary>
/// Match and replace a specific pattern with formatted text
/// </summary>
/// <param name="pattern">Regular expression pattern</param>
/// <param name="match">Match replacement</param>
/// <param name="content">Text to format</param>
/// <param name="multi">Multiline text (optional)</param>
/// <param name="white">Ignore white space (optional)</param>
/// <param name="cult">Culture invariant matching (optional)</param>
/// <returns>HTML Formatted from the original BBCode</returns>
public static string MatchReplace(string pattern, string match, string content, bool multi, bool white, bool cult)
{
	// Requires: using System.Text.RegularExpressions;
	// Matching is always case-insensitive; the other options are added as requested.
	RegexOptions options = RegexOptions.IgnoreCase;
	if (multi) options |= RegexOptions.Multiline;
	if (white) options |= RegexOptions.IgnorePatternWhitespace;
	if (cult) options |= RegexOptions.CultureInvariant;

	return Regex.Replace(content, pattern, match, options);
}

Enjoy!

Addendum…
Robert Beal, in particular, has created a wonderful HtmlUtility class for C# 3.0+ that will only allow certain tags and tag attributes. If your visitors make extensive use of HTML tags, then that is a better option than my system. If you want to implement a feedback system for guest writers with strong HTML support, Robert’s example is highly recommended.

My example is really only for people who post in plain text most of the time and only occasionally post formatted text and videos.

In fact, it’s probably best to avoid extensive HTML support for ordinary comments, as that will only encourage users to abuse the system. You’re better off using this for something minimal.

22 thoughts on “ASP.Net BBCode (C#)”

  1. Why thank you!

    You may need to modify a few things here and there as the C# Regex patterns may slightly differ from PHP ones.

    Also, that example by Robert Beal might come in handy. It’s all C#, but you may get some ideas from that as well.

  2. Cheers for the link back. My class was built with tinymce in mind.

    It still needs some work around making it properly white list (ie encode everything then match what we believe to be “good”). And a few other bits and pieces. I’ve not had a chance to go back to it just yet.

    I’ve got about 20 unit tests for it too, to check it protects against various XSS attacks. I’ll try and update it next week and include them.

    • Hi chere,

      Unfortunately this code will only work in C# not PHP.
      phpBB as the name suggests, is written in PHP.

      But you can implement custom BBCodes without using this code.
      Just ask around the phpBB support section and they can help you with everything you need.

  3. No matter what I do, I can’t get this code to copy and compile cleanly. VS2005 has problems with the quotes in some of the strings. I’m hoping that you have some suggestions. :)

    • I’m really sorry about that. Try this version.

      Unfortunately, this is because the WordPress source code formatter introduces a lot of artifacts into it. It adds extra spaces, takes spaces out, and turns ‘<’ into ‘&lt;’. It also does some other strange things, like removing whole elements such as backslashes etc…

      And I’ll add a warning to the intro on this blog as well.

      edit_
      I just edited this post again with a fresh copy paste. Hopefully this time people should have fewer problems.

    • Hi Ravenheart,

      Sorry about the confusion.

      QuoteUrl is just a small helper function to turn the reference ID (post ID) into a link referring back to that post.

      In this case :

      public static string QuoteUrl(string ID)
      {
          return "/getpost.aspx?id=" + ID;
      }

      Where “getpost.aspx” would be a page getting the exact post directly instead of listing the whole thread. This is a feature you would commonly find in certain forums.

      I didn’t include it with this code because the quoted URL page really depends on how your application is set up. You may have a completely different URL structure.

      Hope this clears it up.

    • This was written a while ago as part of a much older project, so I’ll have to track it down again.

      Meanwhile, you should look into Robert Beal’s example on the bottom of the post. It may give you a better example in the longer run.

    • Hi Orium,

      Yes, this class has quite a few flaws since it relies on regular expressions. There are lots of scenarios where the code will fail to recognize a valid pattern, which is why I have posted a better alternative on the very top of the post.
