Citations data filter for authorcites.txt -- local/stage2/authorcitesfilter


Let Jose look into iscited.txt and fix nonsense paper titles appearing
instead of handles at the top and the bottom.

1. run 

   perl -Ilib bin/recover_citation_events.pl -a --clean

to remove citations that are no longer found in the citations table

3. make it run a citations search by documents for every account


Move test.acis to the sahure.openlib.org machine (I should now be
allowed to root@sahure.openlib.org).

On 2007-04-25 there were mysterious problems for some users with
$screen_name undefined in handle_request() (Web::App).  Also, with no
last name in initial screen processing and other weird things.  See my
error-log report From: RAS/ACIS <aras@authors.repec.org> To:
kurmanka@yandex.ru Date: Wed, 25 Apr 2007 23:57:02 -0400 Subject: RAS
error log 2007/04/25 in RAS-ACIS mailbox. 
-- found no explanation and given up (2007-05-08 15:36)




RAS: Christian suggests wording for the confirmation screen:

   Please check your mailbox, where a confirmation message should
   arrive shortly. Follow the instructions detailed in this
   message. Your registration will not be valid until this last step is
   completed. If you did not get the message:

   1) check that the email address is correct (see above);
   2) messages may sometimes take a while to clear some hurdles (firewalls,
      spam checks), so wait a little more;
   3) check your spam or junk-mail folder;
   4) alert the RePEc Author Service admin (email at foot of page).

(Sat, 27 Jan 2007 17:15:39 -0500 (EST))


ACIS is disk-space hungry; it is a disk space HOG.  

  - limit RI history tracking to only a number of latest events?

     - how does berkeley db manage space fragmentation in the
       database?
 
  - change RI history object to a much simplier array structure
    (instead of a hash)

Go through all XXX marks and either fix or re-label them as "ZZZ" or
"maybe".

Does main.conf.eg needs a revision? Add comments?


Things would be nice to have:

 - fastcgi support

 - check APU queue building: does it really sort by time?

 - a database of the configuration parameters, database tables, 
   sysvars/sysflags, sysprof param names, etc.  I don't know how.  Via
   ACIS itself?

 - APU auto-throttling?

 - improve registration confirmation system to fix our response on
   already confirmed URLs


CHECK: Name-English should never be left empty! (Now it can be if a
user edits his current latin name, but what he enters is invalid.)

CHECK: On the profile page, do not list items which have no sid, or
empty title (e.g. RePEc:rus:* items)

  - also: do not list editors, if the editors element is
    empty

CHECK: How do I treat research profile items (/contributions/accepted)
which have no record in the resources table?  I should remove them!



Documentation:

-  update data backup instructions (section What to backup and how to restore)

-  Document special markup that can be used in the <a
   ref='phrases'>phrases</a> and <a ref='site'>site/ files</a>.

-  Short ids database and maintenance




###  ACIS community building 

 - prepare acis.openlib.org

   - publish textilshchiki paper

   - put a large banner there, that we seek like-minded people, and
     volunteers

   - review site navigation

 - RAS

   - maybe ask Christian to put an announcement about it into his monthly
     mailing

   - put an announcement on RAS

   - Maybe make little "Did you know?" ads on RAS with links
     to little details about ACIS.  (Or may be ACIS would be
     just one of these.  Other details may be about RePEc,
     EDIRC and such.)

 - Maybe we should make a nice button "Powered by ACIS"
   ("Powered by ACIS software"? "Runs on ACIS software"?)
   with a link to acis.openlib.org.




short-id assignment: two simultaneous requests should not cause a
deadlock.  but how?


Documentation to fix:

  - "RePEc" -- eliminate

  - configuration:

    - umask parameter

    - requests-log parameter

    - echo-arpu-mails parameter

  - add new files to the filelist


Web-App

  $app -> handle_request( $request );

  $app -> handle_cgi_request();


Understand and fix "Not satisfied with search results"
problem on research/autosearch in Opera 7.23

RePEc-Index: 

  series data shall not be processed unless the series
  template exists and the series shall not be processed
  unless the archive template exists

Validating email addresses:
   giuseppe.carone.@cec.eu.int (? maybe this one is OK)
   cat@asb..dk


Forgotten password: when there is no such account in the
system we still shall show what exactly did user enter to
see this message.


It looks like Encode module's MIME-Q implementation can
sometimes split an email address at position 79 just for the
paragraph right margin's sake.


It turns out many people remove items from their profile
thinking it removes them from the database or from
EconWPA. We should put something to this effect, like
"removing items from your profile does not erase them from
the database". (Christian)


Christian suggests that when there is an error loging in,
the script suggests the following:

  "This email address does not correspond to an existing
  account in the RePEc Author Service. Check its
  spelling. If you have changed your email address, please
  enter your old address, and then modify your settings. To
  create your account, go here(hyperlink)."


Automatic research profile search and name variations:

  Christian's suggestion that I have to convert accents and
  dashes in names before storing them to database and before
  searching the database.

  I discussed with Ivan various issues regarding the performance of the
  search engine for the authors, and I am not sure this was echoed here.

  The first thing I noticed, and Ivan dealt with it already,
  is the treatment of "." with initials. In the data, they
  sometimes appears, sometimes they do not. Ivan has not
  uniformized that. However there are still other issues
  that appear as I watch people using the system:

  - accents: the data sometimes has them, some not, and
    sometimes partially (in the first name, but not in the
    last name). We cannot expect authors to think of all the
    name variations. This should be done like on EconPapers
    where everything is converted to non-accented letters
    for search purposes.

  - hyphens: Jean-Marc cannot find works listed as authored
    by Jean Marc.

  - suffixes not preceded by a comma: Lucas, Jr. does not
    find Lucas Jr.

  and we will probably discover others. All this can be
  solved by applying some regexp like rules to that names
  searched on. We cannot and should expect of the authors to
  be able to dream up all the silly things we may have in
  the database.




process-on-POST screen configuration parameter

/adm/search: replace records search button on the user
search screen with the <a>records</a> link


Deceased people's profiles (discussion on RePEc-run around
2003-10-17)


Remove error message when the session has expired, but is
not neccessary for this request.

make new-user session lifetime a configuration parameter



typography in generated HTML: apostrophe

make the "required" word on all forms -- in red color text ??

A user deletes his record.  Do we clear ALL his data from
the databases?  What about the suggestions table?  What
about the sysprof table?

Background search fails and leaves a thread record.  The
thread record blocks the new user in the contributions
screen.

  - monitor threads table?
  - remove outdated items?

Check for XXX in the sources and resolve/clear them.



WISHES:

  - confirmation handling.  Users have to click the confirmation URL and then
    enter their login & password to enter the account.  Many services work
    this way.  But why wouldn't users immediately get into their new profile
    after clicking the confirmation URL?

  - new users can't change a typo in their email address, if they did it.  And
    they will only realize it by the end of registration process.



The Settings screen:

  - move cookie options to a group of its own

  - cookie clear button




QUESTIONS: 

- Advertise welcome page URL to users?  Suggest them
  bookmarking it?



EVENTS LOGGING

- log request IPs

may be:

- separate each event details into main and secondary, show
  secondary on request



ADMINISTERING

- editing a userdata

- upgrading a user



SERVICE ANNOUNCEMENTS, A FEATURE REQUESTED BY CHRISTIAN

- a database table

- a presentation strategy

- an presentation implementation

- submit interface



ADVANCED USER:

- advanced user's "delete record" function.  Shall only
  delete a particular record.

- "new person record" screen




GENERAL THINGS:

- a way to see acis.users table ?  -- what for?

- make system software version visible for the end-users

- compile thread-safe mysql client libraries on nebka
  (-with-thread-safe-client configuration option)

- install mysql 4.1 on acis.openlib.org 


- identifiers search - automatically select phrase search
  (on the page, HTML/JavaScript)

- an idea for additional research items search (???):
  
  -- decompose the name variations into a set of words
  -- search mysql by the last name
  -- scan the results for word matches with the variations
     words
  -- sort in accordance with matches




RePEc-Index:

- write a log file of new/changed/disappeared files and
  records




WOULD BE NICE TO HAVE, BUT NOT ESSENTIAL:

- make BCC-ing admin on the confirmation mails a
  configurable option

- /research/claim screen which could be used to add a
  specific item (from Ideas, e.g.) to the research profile,
  one-click.


- accessibility

  -- make a link to skip navigation ?

- New-user registration.  Would be nice to have "Abandon"
  link near to the "Continue!" link on the INDEX page.

- Help section 
  - privacy policy
  - cookie-based login and welcome URL



NOT URGENT, BUT IMPORTANT

- confirm by email

- RePEc database




OTHER ISSUES, NOT URGENT AND QUESTIONABLE.

- Data::DumpXML: dumping zero character should not be, or should be
  escaped ?

- ACIS user records identification strategy

- store submited institutions as individual records in
  userdata:/data/records and process them accordingly in userdata
  processing: store them in ARDB, in institutions table... ???

- move account deletion to log off stage?

- form checking and processing: partial processing when some of the
  parameters need not to be asked again, like the password on the
  initial page ??

- bin/setup script should look at a file before writing to it and only
  write it if admin hasn't edited it himself ?

- RePEc-Index: wrap custom-processing into eval with logging the
  problems in it



OTHER ITEMS OR IDEAS OR PROBLEMS

- new-user registration account locking: if a user started
  to register, we shall lock the account AND if someone
  starts to register for the same email address, shall give
  an option to continue the registration?  -- I don't know.
  Too difficult, too complex.


- RePEc::Index: Notice when userdata is edited by admin, not
  by the owner.  How?

    Yes, look at the last-login-date.  If it is near, then it is
    probably the user.  If it is not, then it means nothing.
    RI-processing of the userdata file might have been deferred.
    Another similar approach is to look at date difference between
    last-login-date and the timestamp of the file's last
    modification.  Normally, when user edits the file, it will be
    saved within seconds of the last-login-date.

    Also, I need: offline person profile update, based on userdata
    changes.  Oh, yeah, that I have already, basically.


- should I not make ACIS::Web::Person a class for person records?

- name/last-change-date is based on: last, latin, and full names only?

  \-> either fix the documentation (id: name-variations) or the code
  (Person.pm)
