Porting Word2MediaWikiPlus to VB.NET: Part 12 (initialization continued…)

[Previous articles in this series: Prologue, Part 1, Part 2, Part 3, Part 4, Part 5, Part 6, Part 7, Part 8, Part 9, Part 10, Part 11.]

MW_Initialize()

Much of this function seems to repeat the actions taken in Word2MediaWikiPlus(), so it’s a bit weird to see it done here as well (since this function is called explicitly by the other).  While some of it can be immediately discarded, other bits have to be examined more closely – mostly because they’re poorly documented (at least at the point from which they’re being called):

  • Again we have an ImagePath enumeration and/or creation
    • There’s an interesting new function I haven’t seen before: IIf(x,y,z)
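      • VBA For Dummies tells me it does “test for ‘x’; if true, do ‘y’; if false, do ‘z’”.  Fairly tidy little function there.  (One caveat: IIf is an ordinary function, so both ‘y’ and ‘z’ get evaluated no matter how the test comes out.)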
    • Looking deeper into what’s going on here, the macro is assigning the ImagePath setting to a folder named “wiki” under the user’s My Pictures folder
    • This doesn’t make a lot of sense for a folder of temporary files that are deleted at the end of the session (or before the beginning of the next)
    • Therefore I’m going to make two changes:
      • this folder will be created as a subfolder of the user’s %TEMP% location
      • this folder will not only be emptied at the beginning of a session, but (as a good citizen of the computer) it will also empty its contents once it has completed a conversion
  • Again we have an EditorPath enumeration
    • it appears that the only path being set is the Microsoft Photo Editor (which we’ve previously confirmed is no longer available)
    • Is there any way to actually perform the image manipulation to which so much code has been devoted?
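The two ImagePath changes proposed above could look roughly like the following. This is only a sketch using System.IO: the folder name "Word2MediaWikiPlus" and the names GetImageFolder and EmptyImageFolder are made up for illustration and don't come from the original macro.

```vbnet
Imports System.IO

Module TempImageFolder
    ' Hypothetical folder name; the original macro uses "wiki" under My Pictures.
    Private Const FolderName As String = "Word2MediaWikiPlus"

    ' Returns the working folder under the user's %TEMP% location,
    ' creating it if it doesn't exist yet.
    Public Function GetImageFolder() As String
        Dim folderPath As String = Path.Combine(Path.GetTempPath(), FolderName)
        Directory.CreateDirectory(folderPath)   ' does nothing if the folder is already there
        Return folderPath
    End Function

    ' Deletes every file in the working folder but leaves the folder itself.
    Public Sub EmptyImageFolder()
        For Each filePath As String In Directory.GetFiles(GetImageFolder())
            File.Delete(filePath)
        Next
    End Sub
End Module
```

Calling EmptyImageFolder() once before a conversion starts and again in a Finally block after it ends would cover both the "empty at the beginning" and the "good citizen" cleanup cases.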

The more I look at this image extraction code, the more complicated it gets.  At this point I’ve pretty much determined that, for all the effort it’ll cost to implement these image features, it’s just not worth the trouble in v1.  I’ll continue to add TODO: comments to the VSTO add-in to show where the image code will eventually go, but I’m not going to do any further work to understand the image code until the rest of the Add-in is working.

Finally, there are the control characters that are being assigned (^l, ^m, ^p, ^s).  They’re not documented in the code, and I’m having a hard time finding any documentation that discusses the use of these control characters.  It doesn’t help that Google and MSDN Search don’t seem to allow you to search on “^p” — it seems they treat this as either “p” or “<p”.

I believe I could treat these as global constants in the Convert class, but what isn’t clear is whether these control characters are:

  1. special substitutions in Word, and will get converted to the native Word paragraph/new line/blank/page break code (in which case I should just use the native VSTO/VBA enumerations), or
  2. treated by Word as ASCII text and sent to the Wiki server, which converts them to HTML when displaying the resulting article (in which case I should probably make sure there isn’t a better way to represent these in MediaWiki format).

Aha!  After trying over & over, I finally came up with a search in Microsoft’s Knowledge Base that gave me an article talking about the “^p” (which it calls a “paragraph mark”):

WD2000: Text Converted to One-Row Table (Paragraph Marks Ignored)

These appear to be ancient character sequences (as early as Word 1.0), so I’m going to first try using the native Word enumerations for these character strings wherever possible.  If I have to go back to using these character sequences, then I’ll drop them back in to the Convert class as Constants.
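For reference, in Word's Find and Replace dialog ^p stands for a paragraph mark, ^l for a manual line break, ^m for a manual page break and ^s for a nonbreaking space. If the codes do end up as constants, one possible shape is sketched below. The constant names are my own invention, and whether the macro actually needs the literal codes or the underlying characters is still an open question at this point.

```vbnet
Public Class Convert
    ' Word Find/Replace codes, kept as literal strings in case they must be
    ' passed to Find/Replace exactly as the original macro does.
    Public Const FindParagraphMark As String = "^p"
    Public Const FindManualLineBreak As String = "^l"
    Public Const FindManualPageBreak As String = "^m"
    Public Const FindNonBreakingSpace As String = "^s"

    ' The characters those codes stand for in document text.
    Public Const ParagraphMark As Char = ChrW(13)      ' carriage return
    Public Const ManualLineBreak As Char = ChrW(11)    ' vertical tab
    Public Const ManualPageBreak As Char = ChrW(12)    ' form feed
    Public Const NonBreakingSpace As Char = ChrW(160)  ' nonbreaking space
End Class
```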

Aside: today I stumbled on an invaluable reference: the Microsoft Word Visual Basic Reference online.  This implies it’s an authoritative reference for all VBA available in Microsoft Word.  Should prove useful.

MediaWikiConvert_Prepare

From what I can tell by a single read through this routine’s code, this all appears to affect the ActiveDocument.  That means all this code can go into the InitializeActiveDocument() subroutine (which I’ve conveniently already defined).

  • MW_SetOptions_2003() is just caching the Application.Options.SmartParaSelection value and then restoring it once conversion is complete.  This can be handled as with the other cached settings.
  • I don’t understand this code fragment at all:
        'Now, if we might have some problems, if we are in a table
        pg.Range.Select
        If Selection.Information(wdWithInTable) Then Selection.SplitTable

  • If a variable like convertPageHeaders was always False (as I can’t find anything that sets it True), then why would such a huge block of code be hidden inside this code block:
    If GetReg("convertPageHeaders")... End If
    It's just hard to guess what the programmer's intentions were with a never (rarely?) called piece of code.
  • Then there’s a lot of boring code conversion, where I’m just giving methods and variables more meaningful names, adding appropriate prefixes to all the Word enums being used, and just commenting the crap out of things where I don’t have a clue how to fix some weird or cryptic code routines.
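The cache-and-restore handling of SmartParaSelection mentioned above might be sketched like this. It assumes a reference to the Word interop assembly; the class name OptionCache and its method names are hypothetical, and switching the option off during conversion is my guess at what the original macro intends rather than something I've confirmed.

```vbnet
Imports Word = Microsoft.Office.Interop.Word

Public Class OptionCache
    Private ReadOnly wordApp As Word.Application
    Private savedSmartParaSelection As Boolean

    Public Sub New(ByVal app As Word.Application)
        wordApp = app
    End Sub

    ' Remember the user's setting, then switch it off for the conversion
    ' (assumption: the macro wants it off while it manipulates selections).
    Public Sub Save()
        savedSmartParaSelection = wordApp.Options.SmartParaSelection
        wordApp.Options.SmartParaSelection = False
    End Sub

    ' Put the user's original setting back once conversion is done.
    Public Sub Restore()
        wordApp.Options.SmartParaSelection = savedSmartParaSelection
    End Sub
End Class
```

Calling Save() before the conversion and Restore() inside a Finally block would put the user's setting back even if the conversion throws.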

Reference to a non-shared member requires an object reference

The most interesting thing I’ve had to research so far was the problem I created for myself by implementing the code into two classes (so far).  I finally got around to calling the Convert class’ public methods in the ThisAddin class’ uiConvert_Click() handler.  As the naive little programmer that I am, I of course first tried to just set the Imports statement at the top of the ThisAddin class, and then call the public methods “naked” like so:

        InitializeActiveDocument()

        InitializeConversion()

        PerformConversion()

Of course that didn’t work, but I didn’t know why at the time.  Instead, I scratched my head for quite a while over how to handle the compiler error “Error 232: Reference to a non-shared member requires an object reference”.

I’ve run up against this before, and I’m pretty sure I was lured at the time down the path to hell: I started adding Shared declarations all over the place.  It’s really tempting — when the IDE implies you should try an easy fix like this, it’s hard to know why this should be bad.  “Didn’t the IDE’s developers know what they were doing?”  “Why would they lead morons like me astray?”

Unfortunately, this is akin to tugging at that first loose strand of a nice wool sweater: pretty soon I’d added so many additional Shared declarations that I’m sure the code was wide open to all sorts of future, stealthy issues I have no idea about.

This time around, once I saw that one Shared begat yet another implied request to add another Shared declaration, I stopped and did some further digging around.  While I wasn’t able to find any articles or MSDN docs that really spelled it out for me, I think I figured out a worthy approach on my own.  [This forum thread was as good as any.]

I’ve published the following as Community Content to the “Error 232” page on MSDN.

Avoid adding the Shared keyword

While this error message tempts the inexperienced programmer with the “easy” solution of just adding the Shared keyword to the requested Method, I advise strongly against it.  Unfortunately there’s little documentation or advice out there aimed at programmers like me who don’t really understand the problems they’ve created, nor the trade-offs in the possible solutions being (cryptically) recommended.  Hopefully this’ll help other folks like myself avoid the really nasty mistake I’ve already made a few times.

The trouble with adding the Shared keyword to a second Class’ Method is that it rarely stops there.  Once you’ve shared a method, whether Public, Private or otherwise, many of the fields and methods it uses will also need adjustments, because a Shared method has no instance of its own to work with.  At least in my experience, the first Shared keyword will work as well as cutting off the Hydra’s head: it usually leads to one or more instances of the error “Error 227: Cannot refer to an instance member of a class from within a shared method or shared member initializer without an explicit instance of the class.”  The first time I tried to kill this Hydra, I had tried to rewrite a bunch of code, and ended up with a rat’s nest of Shared keywords scattered everywhere.
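Here's a small made-up illustration of how that cascade starts. The class and member names are hypothetical, and the offending line is left commented out so the snippet still compiles.

```vbnet
Public Class Converter
    Private imagePath As String = "wiki"   ' an ordinary instance field

    ' Marking this method Shared makes Error 232 go away at the call site...
    Public Shared Sub PerformConversion()
        ' ...but uncommenting the next line then raises Error 227, because a
        ' Shared method has no instance whose imagePath it could read:
        ' Console.WriteLine(imagePath)
    End Sub
End Class
```

The "fix" for that second error is to make imagePath Shared too, and then whatever touches imagePath, and so on down the line.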

A Better Approach than Adding the Shared Keyword

As the advice on this page (cryptically) recommends, try creating an instance of the class.  The big fear that initially scared me off was that I’d end up either (a) unknowingly creating and destroying tons of unnecessary instances of that Class as objects, or (b) not understanding when the object I’d created fell out of scope (and would creep up on me with unpredictable garbage collection-derived errors).

What I did to alleviate this issue was to declare a “class-level” variable in the calling class of the type of the class being called, and then use that variable as the root of all subsequent uses of the called class’ methods.

This example should illustrate:

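Imports BusinessLogic  ' Must sit at the top of the file, not inside a Class (and needs the fully qualified name if the project has a root namespace). Doesn't help with Error 232, and may not be necessary at all

Public Class BusinessLogic   ' This is the "called" class
    Public Sub PerformAction()
        Action()
    End Sub
    Private Sub Action()
            ...
    End Sub
End Class

Public Class UserInterface   ' This is the "calling" class
    Private WithEvents uiButton As Microsoft.Office.Core.CommandBarButton ' needed for the Handles clause below
    Private documentLogic As New BusinessLogic ' class-level variable, created once per UserInterface instance
    Private Sub uiButton_Click(ByVal Ctrl As Microsoft.Office.Core.CommandBarButton, ByRef CancelDefault As Boolean) Handles uiButton.Click
        PerformAction()  ' Causes Error 232: no object reference for an instance method
        documentLogic.PerformAction() ' This call is OK: it goes through the instance
    End Sub
    ...
End Class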

Y’know, sometimes I’m just documenting this stuff for myself, since I know that in a few weeks’ time I’ll have completely forgotten the solution and the logic behind it.  The rest of you happen to be benefiting from my lack of memory, and I wish I could say I was being completely selfless, but I’m getting too old to be lying to folks I never even met. 🙂

Five Ways to Use Visual Studio to Avoid Secure Coding Mistakes

I was talking with a colleague recently, and we got on the subject of static analysis and why we all have to suffer with the problem of first making the mistakes in code, and then fixing them later.  She challenged me to come up with some ways that we could avoid the mistakes in the first place, and here’s what I told her:

  1. IntelliSense — the Visual Studio IDE is pretty smart about providing as-you-type hints and recommendations on all sorts of common coding flaws (or at least, it catches me on a lot of the mistakes that I frequently make), and they’re enabled out of the box (at least for Visual Basic.NET — I can’t recall if that’s true for C# as well).  [But I wonder why IntelliSense doesn’t handle some of the basic code maintenance?]
  2. Code snippets — Visual Studio has a very handy feature that allows you to browse a self-describing tree of small chunks of code, that are meant to accomplish very specific purposes.  These snippets save lots of time on repetitive or rarely-used routines, and reduce the likelihood of introducing errors in similar hand-coded blocks of code.
  3. PInvoke.net — if you ever need to P/Invoke to Win32 APIs (aka unmanaged code), this free Visual Studio add-on gives you as definitive a library as exists of recommended code constructs for doing this right.
  4. Code Analysis (cf. FxCop) — this is a bit of a cheat, as these technologies at first are simply about scanning your code (MSIL in fact) to identify flaws in your code (including a wide array of security-related flaws).  However, with the very practical tips they provide on how to resolve the coding flaw, this quickly becomes a teaching tool to reinforce better coding behaviours so you (and I) can avoid making those mistakes again in the future.
  5. Community resources — F1 is truly this coder’s best friend.  Banging on the F1 key in Visual Studio brings up a multi-tabbed search UI that gives you access not only to local and online versions of MSDN Library, but also to two collections that I personally rely on heavily: the CodeZone community (a group of MS-friendly code-junkie web sites with articles, samples and discussions) and the MSDN Forums (Microsoft’s dazzling array of online Forums for discussing every possible aspect of developing for the Microsoft platform).  If there’s one complaint I have about the MSDN Forums, it’s that there are so freakin’ many of them that it’s very easy to end up posting your question to the wrong Forum, only to have the right one pointed out to you later (sometimes in very curt, exasperated, “why do these morons keep showing up?” form).

However, if like me you’re not satisfied with just the default capabilities of Visual Studio, then try out some of these add-ons to enhance your productivity:

There are a large number of third-party code snippets available from http://www.gotcodesnippets.net as well (though the quality of these is totally unverified, and should be approached with caution).

  • Code Analysis (FxCop):
    • JSL FxCop — a coding tool that eases the difficulty of developing custom rules, as well as a growing library of additional rules that weren’t shipped by Microsoft.
    • Detecting and Correcting Managed Code Defects — MSDN Team System walkthrough articles for the Code Analysis features of Visual Studio.

I’m also working on trying to figure out how to add a set of custom sites to the Community search selections (e.g. to add various internal Intel web sites as targets for search).