Categories: bane or boon

tom

Developer

tom

24 Jun 2010
Posts: 379

I'm committing a major re-hash of the articles plugin to address a bunch of issues:

1. overhead - the current code involves a vast number of database calls, and a lot of table joins on non-indexed fields.

2. simplicity - at the moment if you don't have categories enabled and you add a new articles page, you have to know that you need to enable them, make a new category, match the url, and assign articles to that category. It should just work automatically. Likewise with RSS.

3. flexibility - at the moment with all article options being global there is no simple way to reuse a plugin like articles to produce pages with smaller indexes, different layout or whatever

4. broken stuff - a bunch of things don't quite work (mainly rss) in either multilanguage and/or category sites

5. options overload

So, this is an attempt to fix all of that in one go. I've tested it reasonably extensively and it **should** just work, but there are bound to be some things broken, and it will require setup to be run.

On setup it will check all articles pages have a corresponding category and assign all articles to categories.

It uses the pageid for the category as the primary field so that left joins on the page table are using an indexed field rather than doing a full table scan.

The general idea is that the category side of categories becomes transparent and as far as the user is concerned, 'edit article categories' can become 'article page options'

It adds some new options to the categories so that if you need a superindex (for all articles on the site) or a parent index (for all articles in that category and any in pages under that one), you have that as an option.

Most of the other options don't do anything yet, but the idea is to move as many non-global options out of the options table and into categories as possible.

RSS feed pages are now generated on the fly from any articles index page url with /rss/ appended. RSS categories respect the page language of multi-language sites to only show feeds from that language section, the encoding has been adjusted so that foreign language feeds display correctly, and i've added some xml validation.
If you need a feed for all-articles (and you have more than one category), just make a new articles page (put it in Not On Menu if you don't want it showing) and set the corresponding category to 'All Articles'

and i've changed the default for auto-metadescription - although it may generate metadescriptions that are longer than Google's limit, these are also used by sites like Facebook to generate their share/link snippets and the current format is not particularly attractive or useful for that purpose.

One thing that is broken (assuming it worked before) is the allowance for slashes in the page url for articles pages (eg 'blogs/tom') - hopefully Mike can help me with getting this functioning again, but for the moment you're stuck with having 'blogs' (set with the parent option) and child pages with plain urls (like 'blogs-tom' or 'tom').

Another is i don't think it will automatically generate a unique url for a new articles page if the user doesn't assign one (in which case it won't work) - it should.

Also, this is a kitchen sink commit with a fair amount of bloat in the code (although I've removed quite a lot too). The plan is, once this is bedded in properly, to start pulling re-usable code out into separate plugins - primarily rss feeds and comments - so that any plugin can call them as necessary without having to reproduce everything across every plugin, like the way the tags plugin works now.

When all is complete and if everyone is ok with all of this, my plan is to start converting some of the other plugins (like profiles, gallery3, events etc) to the same model.

feedback/issues from testing would be most welcome.
Rick Rick

24 Jun 2010
Posts: 336

Sounds absolutely brilliant! I definitely like your idea of shifting more options over to the categories table and especially optimising the whole lot for faster and fewer database queries.

In regards to "blogs/tom" not working, I submitted a fix for this in the 1.0 branch (rev 2997). This fixes the problem when working with the old articles plugin and any based off it... does it help with your revamped articles plugin?

Is it possible to move the category syncing to the core? Maybe use fielddata and tabledata and do it automatically?

I like your idea of moving comments out to a separate plugin, very nice.
tom

Developer

tom

24 Jun 2010
Posts: 379

the blogs/tom fix was, as far as i could tell, generating a lot of overhead in extra database calls and by the time i'd worked through everything else, my head hurt too much to work out how to get it back in without them - if you can have a look at it as it stands now and see if it can go back in that'd be great, otherwise hopefully Mike can have a look at it (he wrote the regex/prefix code), or i'll get onto it when i recover.

It just needs an extra bit stuck into isArticleUrl as far as i can tell to get it to recognise that blogs/tom is a prefix in its own right rather than a call to blogs[prefix]/tom[articleurl] (which then sets pageid to the blogs page id and 404s because it can't find an article with the url 'tom') when you have a blogs page as well as blogs/tom

I'll look at putting the sync into core when this code is a bit more stable/tested

Moving options will be gradual - most of our sites have custom templates, so i don't want to remove the global options until people have had a chance to update all of their custom templates to use the category options instead.

Another thing i want to look at is standardising the function names so that getArticles -> getItems, getArticleUrl -> getUrl, isArticleUrl -> isUrl and so on, and use that same format across as many plugins as possible. Partly to make reusing plugin code more straightforward, but also to allow for more standardised calls when one plugin wants to get content from another (like in theme/global).
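
A minimal sketch of what that shared naming could look like. Only the method names getItems, getUrl and isUrl come from the post above; the interface name, the signatures and the in-memory DemoArticles class are assumptions made for illustration, not existing Jojo code.

```php
<?php
// Hypothetical interface: only the method names getItems/getUrl/isUrl come
// from the post; the signatures are assumed for illustration.
interface ItemProvider
{
    /** Return up to $limit items as associative arrays. */
    public function getItems(int $limit): array;

    /** Build the public url for one item. */
    public function getUrl(array $item): string;

    /** Does this uri belong to one of the provider's items? */
    public function isUrl(string $uri): bool;
}

// Toy in-memory implementation standing in for a plugin such as articles.
class DemoArticles implements ItemProvider
{
    private $prefix;
    private $items;

    public function __construct(string $prefix, array $items)
    {
        $this->prefix = $prefix;
        $this->items = $items;
    }

    public function getItems(int $limit): array
    {
        return array_slice($this->items, 0, $limit);
    }

    public function getUrl(array $item): string
    {
        return $this->prefix . '/' . $item['url'] . '/';
    }

    public function isUrl(string $uri): bool
    {
        foreach ($this->items as $item) {
            if (trim($uri, '/') === trim($this->getUrl($item), '/')) {
                return true;
            }
        }
        return false;
    }
}

// A caller (like theme/global) can then treat every plugin the same way.
function teaserLinks(ItemProvider $plugin, int $limit): array
{
    return array_map([$plugin, 'getUrl'], $plugin->getItems($limit));
}
```

With something like this, a template helper only needs to know about the interface, not about which plugin supplied the items.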
Rick Rick

24 Jun 2010
Posts: 336

I fiddled round with the isArticleUrl() function too but discovered that even if that behaves properly, Jojo was still choosing the wrong page from the database. I even tried overriding that page from within the plugin but with minimal luck.

I'll try to think of a better way.
tom

Developer

tom

24 Jun 2010
Posts: 379

oh, one thing i haven't added - the setup script checks to see if any new fields have been added, but it doesn't check for indexes, so if you have existing installs, you should add an index manually for ac_pageid in the articlecategory table through phpmyadmin.
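
For anyone who prefers pasting SQL into phpmyadmin, here is a small sketch that prints the statement. The table and column names are the ones mentioned above; the helper function is hypothetical, and it assumes the table has no prefix on your install and that the index can simply be named after the column.

```php
<?php
// Builds the ALTER TABLE statement for the missing index. Table and column
// names come from the post; the helper itself is hypothetical.
function addIndexSql(string $table, string $column): string
{
    // Backtick-quote identifiers, doubling any backtick inside them.
    $q = function (string $name): string {
        return '`' . str_replace('`', '``', $name) . '`';
    };
    return sprintf('ALTER TABLE %s ADD INDEX %s (%s)', $q($table), $q($column), $q($column));
}

echo addIndexSql('articlecategory', 'ac_pageid'), ";\n";
```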

and i haven't tested the comment subscription code to any great depth..

..and, at some point, i want to look at adding an option to switch handling for sites that have very large numbers of articles (like elsewhere) where I suspect the current method of pulling out the whole articles table in one hit is less efficient than doing separate database calls for index, current, previous and next articles (although I need to test this theory a bit more to find out the cut off point)
searchmaster

25 Jun 2010
Posts: 19

are you able to get the google analytics parameters working for articles
ie http://www.searchmasters.co.nz/articles/i-am-found/?utm_source=feedburner&utm_medium=feed&utm_campaign=Feed

The parameters work for core pages, but not for articles and other plugins.
tom

Developer

tom

25 Jun 2010
Posts: 379

I'll have a look - i presume what's required is to allow the query string on the end through rather than redirecting to the cleaned version in getCorrectUrl ?
mikec

Lead Developer

mikec

25 Jun 2010
Posts: 67

Maybe add a filter to getCorrectUrl that allows other plugins to modify the url returned by the plugin.

Then create a plugin with a filter that adds this on, so then each plugin doesn't need to know about this stuff needing to be on the end. And if they change, then it only needs to be done in one place, not all the plugins.
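
A rough sketch of what such a filter could do, assuming it receives the clean url the plugin considers correct plus the raw request uri. The function name and the rule of keeping only utm_ parameters are assumptions for illustration; how the filter would actually be registered in Jojo is not shown.

```php
<?php
// Hypothetical filter: given the clean url a plugin considers correct and the
// raw request uri, re-attach any utm_* tracking parameters before comparing
// or redirecting. Function name and whitelist rule are assumptions.
function keepTrackingParams(string $correctUrl, string $requestUri): string
{
    $query = parse_url($requestUri, PHP_URL_QUERY);
    if (!is_string($query) || $query === '') {
        return $correctUrl;
    }
    parse_str($query, $params);

    // Keep only analytics parameters; everything else is still cleaned away.
    $keep = [];
    foreach ($params as $name => $value) {
        if (strpos((string) $name, 'utm_') === 0) {
            $keep[$name] = $value;
        }
    }
    if (!$keep) {
        return $correctUrl;
    }
    return $correctUrl . '?' . http_build_query($keep, '', '&');
}
```

Because the rule lives in one function, a change to which parameters are allowed through would only need to be made in that one place.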

Rick Rick

26 Jun 2010
Posts: 336

I'm not quite sure what you guys are planning, it hasn't quite clicked into place in my head... one of those times when I have a completely different idea in my head and it's getting in the way. I could be completely off track here, but please hear me out, and feel free to point out flaws in my view or details I'm missing from your ideas.

Here's my idea, in case it's of any help to you guys... When listing a plugin in the pluginClasses via "provides" API, the plugin can accept responsibility to handle sub urls (it would be easiest if we could expand on the array and check before processing, to avoid breaking old plugins).


$_provides['pluginClasses'] = array(
    'Jojo_Plugin_hambrook_cart_display' => array(
        'display' => 'Articles - Article Listing and View',
        'suburls' => true
    )
);


Or maybe even have it as a checkbox under the "Technical" tab when editing pages. But this would stop things from "just working".

Couple that with changing the page selection query to something similar to my patch for rev 2997 but make it also check that if the url is different to the page url (ie longer) then it checks that the page's linked class has opted to handle sub urls.

If no page is found, a 404 is thrown. If a page is found, it's loaded as per normal and can check the request uri (or be passed the uri) and determine what it should show without having to register a uri and the plugin (Articles in this case) doesn't have to run checking functions for every page loaded. With the other method mentioned above at least one database query is done in checkPrefix() and one in _getPrefix() along with a fair amount of other checking. We could make one core query longer and save at least two queries in every plugin that supports faux urls such as Articles. I know that on my site I also have a Portfolio plugin, so I'd save 4 queries on every page load that wasn't an Article or Portfolio, and 3 on pages that were.

If it's done through the register uri and isArticleUrl functions, then I don't see how those queries can be avoided. But with the method I mentioned above, the checks are only done if we're on the articles page (or a sub page thereof).
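
To make the proposal concrete, here is a sketch of the lookup in plain PHP. It assumes the core has a map of page url to plugin class and a list of classes that opted in with 'suburls' => true; the function name, the array shapes and the class names in the usage note are all hypothetical, and a real version would be a single SQL query rather than a loop.

```php
<?php
// Sketch of the proposed core lookup. $pages maps page url => plugin class,
// $subUrlClasses lists classes that opted in via 'suburls' => true.
// All names here are hypothetical.
function resolvePage(string $requestUri, array $pages, array $subUrlClasses): ?array
{
    $segments = array_values(array_filter(explode('/', $requestUri), 'strlen'));

    // Try the longest candidate first: blogs/tom/an-article, blogs/tom, blogs.
    for ($n = count($segments); $n > 0; $n--) {
        $pageUrl = implode('/', array_slice($segments, 0, $n));
        if (!isset($pages[$pageUrl])) {
            continue;
        }
        $rest = implode('/', array_slice($segments, $n));
        // A longer request only matches if the page's class handles sub urls.
        if ($rest !== '' && !in_array($pages[$pageUrl], $subUrlClasses, true)) {
            continue;
        }
        return ['page' => $pageUrl, 'suburl' => $rest];
    }
    return null; // nothing matched: the caller throws a 404
}
```

With pages 'blogs' and 'blogs/tom' both linked to an opted-in class, a request for blogs/tom/an-article resolves to the blogs/tom page with an-article left over for the plugin to interpret, and no per-plugin checks run on unrelated pages.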

Let's pretend that the brackets below simply surround the portion of the url that is actually a page and not a parameter.

At present Jojo also sets the wrong page when using plugins on sub pages. Eg loading "(blogs/tom)" might load the page with the url "(blogs)". This also applies to "(blogs/tom)/an-article": Jojo ends up loading the "(blogs)" page instead of "(blogs/tom)". To combat this, isArticleUrl and getCorrectUrl would have to check for valid pages as "(blogs/tom/an-article)", "(blogs/tom)/an-article" and "(blogs)/tom/an-article", and that code would be duplicated inside every plugin that uses faux page urls.
Rick Rick

26 Jun 2010
Posts: 336

Ok my girlfriend just glanced over and read this and thinks my tone was a bit sharp. I'm just trying to be helpful and contribute an extra set of eyes. I might be completely wrong, I'm just trying to help :)
tom

Developer

tom

28 Jun 2010
Posts: 379

if it was sharp i missed it...
I've just committed what i think is a fix for the slashed urls issue. There seemed to be two issues - (or at least i think so..) one was that Jojo wasn't allowing slashes in a plugin url, so kept trying to return the 'parent' page id instead of the correct one (eg the 'blogs' page id when you were trying to get the 'blogs/tom' id). Second was to add a check into isArticleUrl to see if the full uri matches an existing prefix, in which case the function should ignore it and let the normal page handler do what it's supposed to rather than matching it mistakenly to 'blogs/tom' as though 'tom' were a url rather than part of the prefix.
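
The second check can be sketched roughly like this. It is a simplified stand-in, not the committed code: the function name is hypothetical, the list of prefixes is passed in as an array instead of being read from the database, and preferring the longest prefix is an assumption about how ties between 'blogs' and 'blogs/tom' should be resolved.

```php
<?php
// Hypothetical stand-in for the check described above. $prefixes is the list
// of articles page urls; the real function works against the database.
function matchArticleUrl(string $uri, array $prefixes): ?array
{
    $uri = trim($uri, '/');

    // Full uri is itself a prefix (an index page): leave it to the normal
    // page handler instead of reading 'blogs/tom' as article 'tom' in 'blogs'.
    if (in_array($uri, $prefixes, true)) {
        return null;
    }

    // Otherwise prefer the longest prefix, so blogs/tom wins over blogs.
    usort($prefixes, function (string $a, string $b): int {
        return strlen($b) - strlen($a);
    });
    foreach ($prefixes as $prefix) {
        if (strpos($uri, $prefix . '/') === 0) {
            return ['prefix' => $prefix, 'url' => substr($uri, strlen($prefix) + 1)];
        }
    }
    return null; // not an article url at all
}
```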

testing would be appreciated. I've run through everything i can think of on sites with one category, several, embedded children etc but there are bound to be edge cases i haven't covered
tom

Developer

tom

6 Jul 2010
Posts: 379

OK, I think that's nearly it for stage one - some odd things happen on converting multi-language/no category existing installs but it's close.. On most sites it should be pretty clean on database calls (providing your indexes are set up as per the install files). If you're using global to call a subset of articles to use in the sidebar it's still much as it was - i haven't figured a way of making that any cleaner - but otherwise it ought to now just work without too much thinking being required.

Next steps are:
to pull the extra code (rss and comments) out into optional plugins,
to adjust the templates to start using the category table options rather than the global ones,
and fix core so that table indexes in the install file are checked in the same way fields are.
and Michael/Mike's ?query handler thing.

For the moment though I want to start applying this logic to gallery3 and a variant of it which will work better for portfolio work (for my own site).

And to finally get around to finishing and committing the various changes to the profile plugin so that it can also work as more of a directory listing when required (when there are many staff members - a hack version of what i'm thinking of is up at www.bioscienceresearch.co.nz ).

So, if anyone else wants to look at that stuff in the meantime they should feel welcome, but otherwise, i'll get back to it anon.
tom

Developer

tom

18 Jul 2010
Posts: 379

most of the bugs ironed out now and a lot cleaner and more consistent. Haven't got to the list above yet.. (although RSS handling has been centralised). Tested it pretty extensively but others' experiences would be useful - anyone game?
searchmaster

19 Jul 2010
Posts: 19

Any chance of getting date of article to be date/time of article. Means that when article submitted and rss created, that the rss to email shows the new article at the correct time, not midnight.
Thanks for all the updates.
tom

Developer

tom

19 Jul 2010
Posts: 379

yes, i've been thinking about that for a while. Can be done easily enough, and would make it more straightforward for date ordering and international sites if everything used unix timestamps.

The only issue is writing the script to convert the existing field over from one to the other, which will mean setup gets forced (again), but considering it's already a requirement with the changes I've made so far, now's probably as good a time as any.

It may also break some custom templates that do their own date formatting.
searchmaster

19 Jul 2010
Posts: 19

Awesome! Thanks. Resetups are always a pain when over many sites, but the changes made make it worthwhile.
tom

Developer

tom

19 Jul 2010
Posts: 379

I think that last commit takes care of the date issue. Seems to respect timezones correctly. Unixdate field will still set the time to midnight if you alter an article to another date but will use a current timestamp if it's left blank.
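
The behaviour described above can be illustrated with a short sketch. The function name, the Y-m-d input format and passing the timezone and current time in as arguments are assumptions for the example; this is not the code from the commit.

```php
<?php
// Hypothetical helper mirroring the behaviour described above: a date-only
// value becomes midnight in the site's timezone, a blank value becomes "now".
function articleTimestamp(string $date, string $timezone, int $now): int
{
    if (trim($date) === '') {
        return $now;
    }
    // The leading '!' resets the time fields, so the result is 00:00:00.
    $dt = DateTime::createFromFormat('!Y-m-d', trim($date), new DateTimeZone($timezone));
    if ($dt === false) {
        throw new InvalidArgumentException("Unrecognised date: $date");
    }
    return $dt->getTimestamp();
}
```

Calling articleTimestamp('', 'Pacific/Auckland', time()) gives the current time, which is what lets an RSS to email service show a new article at the time it was actually posted rather than at midnight.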
tom

Developer

tom

19 Jul 2010
Posts: 379

oh and one other thing to watch out for - if you were using $articleshome as a link to the articles page in your template (like for a teaser feed on the homepage) it doesn't exist anymore... use {$articles[0].pageurl} instead (which includes the language prefix and trailing slash)
tom

Developer

tom

20 Jul 2010
Posts: 379

.. the last of the big ones (god willing). Removed all Comments code and added a new plugin to handle it (which can now be used by any plugin that wants to).

It should handle all of the conversion from one to the other relatively seamlessly if you install the comment plugin and setup. (i think).

The comment buttons work but the edit function is still using the frajax iframe which is not ideal. it would be good to have that as an option that could use optional public editor plugins. Likewise Captcha.

If you're using custom articles templates you'll need to adjust them to the format in the default jojo_article.tpl
basically remove anything to do with comments and replace it with
{if $commenthtml}{$commenthtml}{/if}

It seems pretty stable and if there are issues they're most likely only to show up if your site uses comments. If it doesn't, it shouldn't make any difference.
Rick Rick

23 Aug 2010
Posts: 336

Your new rework of the Articles plugin looks great, but it's now missing two things... In the articles list that getArticles returns, there's no access to category details (Category name and Category URL, previously gotten by joining onto the Pages table) or comment count.

I understand comment count would take a lot more queries now that it's been separated if you want to keep full abstraction. But were category details simply forgotten? Does nobody else use these?