Porting Word2MediaWikiPlus to VB.NET: Part 11 (The Return)

[Previous articles in this series: Prologue, Part 1, Part 2, Part 3, Part 4, Part 5, Part 6, Part 7, Part 8, Part 9, Part 10.]

After a few weeks’ hiatus to work on some other projects (including a couple of releases of CacheMyWork — now with more filtering!), I decided to come back to the W2MWPP effort.  And my overwhelming feeling right now is: thank god I was blogging as I coded!  As it is, it took me probably a full hour to figure out where to focus my attention next:

  • while the Configuration dialog still isn’t finished, I’m going to set that aside for now — it should be pretty easy to figure out what I need to add, especially once I know better which functionality is or isn’t required.
  • I’m going to start into the fundamentals of the code that’ll be called when the “Convert to Wiki” button is clicked.  This code will be called by ThisAddIn.uiConvert_Click().
  • The code in question will be derived from the VBA project’s modWord2MediaWikiPlus.bas file, from the Public Sub Word2MediaWikiPlus() routine.

Sub Word2MediaWikiPlus(): Overview

There are many types of constructs in this library, many of which I’m sure will be called as part of the base Word2MediaWikiPlus() routine.  The routine itself is only about 150 lines of code, though, so it shouldn’t be too difficult to implement.

Still, it’s probably a good idea to familiarize myself with some of the constructs in this library:

  • There are three unmanaged code Functions — two for MessageBox construction, and one called SendMessage.  I’m sure I’ll understand what they’re supposed to be for once I see them in context, and likely I can use some very simple managed code Methods in their place.
  • There are Constants both private and public.  [Thankfully, most of these seem to have been reasonably well documented.]
    • Some, like WMPVersion, will be taken care of by Visual Studio.
    • Others, like WikiOpenPage, should be implemented as configuration settings not code constants.
    • The majority, such as NewParagraphWithBR, will likely be preserved.  They encode stylistic choices I may not personally agree with, but they’ve probably been requested over the years, and dropping them would probably piss off a whole bunch of folks who’ve grown to know and love the VBA macro.
  • There are plenty of Variables as well of course, and while the usage of the majority will likely become self-evident, there are a few (e.g. Word97) that I can probably safely drop.
  • There are some constructs labelled as Types as well here — I’m not sure, but they resemble the use of Structures in managed VB, and I’ll likely see them in heavy use.
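
To make the Type-versus-Structure comparison concrete, here is a rough sketch of how a VBA Type might carry over to a VB.NET Structure.  It uses DocInfoType, which comes up later in this post, and shows only the three members named there.  Treating them all as String is my assumption; I haven't confirmed it against the VBA source.

```vbnet
Imports System

Module DocInfoSketch
    ' Hypothetical VB.NET counterpart of the VBA "Type DocInfoType" block.
    ' Member types are assumed to be String; the real Type may hold more fields.
    Public Structure DocInfoType
        Public DocName As String
        Public DocNameNoExt As String
        Public Articlename As String
    End Structure

    Sub Main()
        ' A Structure is a value type, so no New is needed before assigning fields.
        Dim info As DocInfoType
        info.DocName = "Report.doc"
        info.DocNameNoExt = "Report"
        info.Articlename = info.DocNameNoExt
        Console.WriteLine(info.Articlename)
    End Sub
End Module
```

The member access syntax (info.DocName) is the same on both sides, which is why most code that uses a Type should port with little change.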

At this point, I’m going to implement these constructs as needed.  I’d rather not add all these up front, ’cause (1) it’ll be pretty confusing for me, and (2) there’s likely some code that is no longer needed (or even has been forgotten).

Sub Word2MediaWikiPlus(): Initialize ActiveDocument

Even before generating the first set of routines, I figured I should do what I could to set up the Class Library with a little forethought:

  • Added a new Class to the Word2MediaWiki++ project called “Convert.vb”
  • Wanted to add this class to a global W2MWPP namespace (see Part 4 for details)
  • I once tried to add a namespace designation to a project after I’d already started coding it, and it was a disaster — references broke everywhere, and I’m not sure I ever got it cleaned up
  • Not knowing enough about namespace and class naming, I flipped through a couple of the books I had on hand and determined that this should be trivial to do up front
  • I wrapped the Public Class…End Class statements with Namespace Word2MediaWikiPlusPlus…End Namespace statements, and proceeded onwards
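
The wrapping described in the last bullet looks roughly like this.  It's a minimal sketch: the Run method is a hypothetical placeholder, not something taken from the VBA source.

```vbnet
Namespace Word2MediaWikiPlusPlus

    Public Class Convert
        ' Hypothetical entry point; the real conversion code will be ported in here.
        Public Sub Run()
        End Sub
    End Class

End Namespace
```

One thing to keep in mind: a VB.NET project also has a Root namespace setting in its project properties.  If that setting is non-empty, a Namespace block declared in code is nested beneath it, so the class's fully qualified name becomes RootNamespace.Word2MediaWikiPlusPlus.Convert.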

The first sixty lines or so of code in Word2MediaWikiPlus() all have to do with acquiring a handle to an active document.  In the world of Document add-ins, I presume this can be very tricky, since the code itself starts from a single document’s context and has to navigate outwards from there.  However, when we’re working with VSTO Application add-ins, it seems fairly easy to me to get access to an active document.

In fact, for the purposes of this tool, I’m going to assume that the user wants to convert whichever Word document is the active document.  The only error condition I should need to check is that there is at least one document object open.  This leads to the following code:

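            Dim App As Word.Application = Globals.ThisAddIn.Application

            'ActiveDocument throws a COMException when no document is open,
            'so check the Documents count before touching it
            If App.Documents.Count = 0 Then
                'When there's no active document open just return back to Word
                Exit Sub
            End If

            Dim Doc As Word.Document = App.ActiveDocument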

The only code left in the document initialization block is this:

    DocInfo.DocName = ActiveDocument.Name
    DocInfo.DocNameNoExt = DocInfo.DocName
    p = InStrRev(DocInfo.DocName, ".")
    If p > 0 Then DocInfo.DocNameNoExt = Left$(DocInfo.DocName, p - 1)

I was baffled by this at first, but a quick trip to MSDN Library cleared it up.  The macro is parsing out the document Name without the file extension: InStrRev finds the position of the last “.”, and Left$ keeps everything before it.  DocInfo itself is an instance of the DocInfoType Type, which is just a structure to store various properties of the document.  There’s no particular reason I can see so far to use a structure for the DocName & DocNameNoExt properties, and the other properties to me don’t seem particularly related to the document itself.  At this point, I’ll assume DocInfo isn’t needed.  [Certainly my searches of the modWord2MediaWikiPlus.bas source only found two references to the DocInfo.DocNameNoExt property: once to assist the MW_GetImagePath() function, and once to set DocInfo.Articlename.  Both should be doable without this Structure, since .NET provides the System.IO.Path.GetFileNameWithoutExtension method.]
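
As a quick sanity check on that last point, here is a small standalone sketch of the managed replacement for the InStrRev/Left$ pair.  The module and function names are made up for illustration; only Path.GetFileNameWithoutExtension is the framework's own.

```vbnet
Imports System
Imports System.IO

Module DocNameSketch
    ' Returns the document name with its final extension removed,
    ' matching what the VBA InStrRev/Left$ code does for a bare file name.
    Function GetDocNameNoExt(ByVal docName As String) As String
        Return Path.GetFileNameWithoutExtension(docName)
    End Function

    Sub Main()
        Console.WriteLine(GetDocNameNoExt("Report.docx"))   ' prints "Report"
        Console.WriteLine(GetDocNameNoExt("Report.v2.doc")) ' prints "Report.v2"
        Console.WriteLine(GetDocNameNoExt("Document1"))     ' prints "Document1"
    End Sub
End Module
```

Only the last extension is stripped, and a name with no “.” at all comes back unchanged, which is the same behaviour as the macro.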

However, this leads to the MW_Initialize() function, which I would guess also should be part of the Convert class’ initialization code.  I’ll check that out soon.

The last of the code in this section is the call to MW_LanguageTexts().  This is a localization macro that sets a series of Registry values and Msg_* variables depending on a language setting retrieved from the Registry.  All this can be managed quite well using Resource files, so there’s no need to mess with setting all this via code at the moment.
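
For what it's worth, here is a minimal sketch of what the Resource file approach could look like.  It assumes a Messages.resx file (plus per-language variants such as Messages.de.resx) embedded under the base name Word2MediaWikiPlusPlus.Messages.  That base name and the LocalizedText class are hypothetical; without the embedded resources, GetString will throw a MissingManifestResourceException.

```vbnet
Imports System.Globalization
Imports System.Reflection
Imports System.Resources

Public Class LocalizedText
    ' Assumption: Messages.resx is embedded in this assembly under this base name.
    Private Shared ReadOnly Messages As New ResourceManager( _
        "Word2MediaWikiPlusPlus.Messages", Assembly.GetExecutingAssembly())

    ' Looks up a string by key for the current UI culture, falling back
    ' to the neutral resources when no culture-specific entry exists.
    Public Shared Function GetText(ByVal key As String) As String
        Return Messages.GetString(key, CultureInfo.CurrentUICulture)
    End Function
End Class
```

A call such as LocalizedText.GetText("Msg_Finished") would then stand in for reading the Msg_Finished variable that MW_LanguageTexts() fills in today.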

I’m interested in knowing whether all the Registry settings being used by this VBA project are for localization, so let’s enumerate them all…

Registry value            Purpose
ClickChartText            localization
EditorKeyLoadPic          localization
EditorKeyPastePicAsNew    localization
EditorKeySavePic          localization
EditorPaletteKey          localization
txt_Footnote              localization
txt_PageFooter            localization
txt_PageHeader            localization
txt_TitlePage             localization
UnableToConvertMarker     localization
WikiCategoryKeyWord       localization
WikiSearchTitle           localization
WikiUploadTitle           localization

Wow, what a…symphony of Registry settings here, only a handful of which are directly used for localization.  There are quite a number of them used for various Image manipulation operations, and others for various application settings.  From what I can tell, there’s no reason these can’t also be stored in the project’s app.config file.

Then there are the Msg_* variables: are all of these devoted to localization too?

Variable name           Purpose
Msg_CloseAll            localization
Msg_Finished            localization
Msg_LoadDocument        localization
Msg_NoDocumentLoaded    localization
Msg_Upload_Info         localization

Yes, that’s all of them.

Further Config and Init

The final bit of code I’ll deal with today is the “user dialog on config settings”.  It appears that this calls some additional initialization code:

  1. performs customization if never performed
    • opens the frmW2MWP_Config form
    • sets ImageExtractionPE Registry setting
    • sets the variables EditorPath, OptionHtml, OptionPhotoEditor, OptionPhotoEditor.Enabled
    • sets language settings
    • sets any configured customization settings from the Config form (aka the VSTO UI Config dialog)
  2. sets Article name
  3. sets convertImagesOnly = False
    • this variable is used in Word2MediaWikiPlus() to determine whether to convert the whole document or only the embedded images.
    • Since this is apparently never set to True, I’ll remove the references.
  4. checks Word version and sets some control characters
    • It doesn’t appear that we’ll need this, as VSTO only seems to support Word 2003 and above
  5. sets Image Path
  6. sets the image editor path
  7. caches the value of a couple of Word’s Tools, Options settings
    • I’ve commented in these commands in case they’re necessary later
    • However, I assume there are better ways of preserving Word’s initial configuration than caching before changes and then flushing after changes
    • That is, it should be possible to make per-session changes to such settings, without having to make sure that Word’s settings are explicitly reconfigured after the session has completed
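
If it turns out that caching really is the only option, a Try…Finally block at least guarantees that the restore happens even when the conversion throws.  The sketch below is an assumption-heavy illustration.  I haven't listed which Options the macro actually caches, so SmartCutPaste is just a stand-in, and OptionsGuard and RunWithOptions are hypothetical names.

```vbnet
Imports System
Imports Word = Microsoft.Office.Interop.Word

Public Class OptionsGuard
    ' Runs the supplied work with SmartCutPaste switched off, then puts the
    ' user's original value back whether or not the work succeeded.
    Public Shared Sub RunWithOptions(ByVal app As Word.Application, ByVal work As Action)
        'Cache the user's current setting before changing it
        Dim savedSmartCutPaste As Boolean = app.Options.SmartCutPaste
        Try
            app.Options.SmartCutPaste = False
            work()
        Finally
            'Always restore, even if work() threw an exception
            app.Options.SmartCutPaste = savedSmartCutPaste
        End Try
    End Sub
End Class
```

The conversion routine would then be passed in with AddressOf, which keeps the save/restore bookkeeping out of the conversion code itself.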


That’s enough for one sitting, eh?
