Best way to process/read these files (HCFA medical claim form)

I am looking for suggestions on the best approaches to string parsing when reading a file in C#. The specific scenario is one most people will not be familiar with unless they work in healthcare, so I will give a brief explanation first.

I work for a health plan, and we receive claims from doctors in several ways (EDI, paper, etc.). The paper form for standard medical claims is the "HCFA" or "CMS 1500" form. Some of our contracted doctors use software that lets them create and save their claims in the HCFA "layout", but as a text file (so you can think of it as the paper form without the background/boxes). I attached an image of a fictitious claim file that shows what this looks like.

Claim information is currently extracted from the text files and converted to XML. The whole process works fine, but I would like to make it better and easier to maintain. There is one serious challenge in this scenario: each doctor's office may send these text files to us in a slightly different layout. That is, Doctor A may have the patient's name on line 10 starting at character 3, while Doctor B may send a file where the name begins on line 11 at character 4, and so on. Yes, what we should be doing is enforcing a standard layout that any doctor who wants to submit this way must follow. However, management has stated that we (the developers) must handle the various possibilities ourselves and that we cannot ask the doctors to do anything special, since management wants to maintain good relations.

There is currently a "mapping table" set up with one row for each doctor's office. The table has a column for each field (for example, patient name, participant ID, date of birth, etc.). Each column gets a value based on the first file we received from that doctor (we set up the map manually). Thus, the PATIENT_NAME column might be defined in the mapping table as "10,3,25", meaning that the name starts on line 10 at character 3 and can be up to 25 characters long. This has been a painful process, both in terms of (a) creating the map for each doctor, which is tedious, and (b) maintainability, since doctors sometimes change their layout without warning and we then have to remap everything for that doctor.

The file is read line by line, and each line is added to a List<string>.

Once that is done, we do the following: we get the map data, then read through the list of file lines and pull out the field values (recall that each mapped field is a value like "10,3,25", without the quotes):

    ClaimMap M = ClaimMap.GetMapForDoctor(17);
    List<HCFA_Claim> ClaimSet = new List<HCFA_Claim>();

    // Claims is List<List<string>>, where we have a List<string> for each claim
    // in the text file (it can have more than one, and the file is split up into
    // separate claims earlier in the process).
    foreach (List<string> cl in Claims)
    {
        HCFA_Claim c = new HCFA_Claim();
        c.Patient = new Patient();
        c.Patient.FullName = cl[Int32.Parse(M.Name.Split(',')[0]) - 1]
            .Substring(Int32.Parse(M.Name.Split(',')[1]) - 1, Int32.Parse(M.Name.Split(',')[2]))
            .Trim();
        //...and so on...
        ClaimSet.Add(c);
    }

Sorry this was so long, but I felt some background/explanation was needed. Are there any better or more creative ways to do something like this?

+4
4 answers

You need to work on applying the DRY (Don't Repeat Yourself) principle by separating concerns. For example, the code you posted has explicit knowledge of:

  • how to parse the claim map, and
  • how to use the claim map to parse a list of claims.

Thus, there are at least two responsibilities tied directly into this one method. I would recommend changing the ClaimMap class to be more representative of what it is actually supposed to represent:

    public class ClaimMap
    {
        public ClaimMapField Name { get; set; }
        ...
    }

    public class ClaimMapField
    {
        public int StartingLine { get; set; }
        // I would have the parser subtract one when creating this, to make it 0-based.
        public int StartingCharacter { get; set; }
        public int MaxLength { get; set; }
    }

Note that ClaimMapField represents in code what you spent considerable time explaining in English. This reduces the need for lengthy documentation. Now all the calls to M.Name.Split can be consolidated into a single method that knows how to create ClaimMapFields from the original text value. If you ever need to change the way your ClaimMap entries are represented as text, you only have to change one place in the code.
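A minimal sketch of that single parsing method, assuming the mapping value is stored as comma-separated "line,character,length" text (as the Split(',') calls in the question imply). The static Parse method and its error handling are hypothetical and not part of the original code.

```csharp
using System;

public class ClaimMapField
{
    public int StartingLine { get; set; }      // 0-based after parsing
    public int StartingCharacter { get; set; } // 0-based after parsing
    public int MaxLength { get; set; }

    // Hypothetical factory: turns a mapping-table value such as "10,3,25"
    // into a field, converting the 1-based line and character to 0-based.
    public static ClaimMapField Parse(string mapValue)
    {
        if (mapValue == null) throw new ArgumentNullException(nameof(mapValue));

        string[] parts = mapValue.Split(',');
        if (parts.Length != 3)
            throw new FormatException(
                "Expected 'line,character,length' but got '" + mapValue + "'.");

        return new ClaimMapField
        {
            StartingLine = int.Parse(parts[0].Trim()) - 1,
            StartingCharacter = int.Parse(parts[1].Trim()) - 1,
            MaxLength = int.Parse(parts[2].Trim())
        };
    }
}

public static class Program
{
    public static void Main()
    {
        ClaimMapField name = ClaimMapField.Parse("10,3,25");
        // Prints "9 2 25": line and character are now 0-based.
        Console.WriteLine(name.StartingLine + " " + name.StartingCharacter + " " + name.MaxLength);
    }
}
```

With this in place, the mapping text is interpreted once, when the map is loaded, and a malformed entry fails with a clear message rather than deep inside the claim loop.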

Your code may now look something like this:

    c.Patient.FullName = cl[map.Name.StartingLine].Substring(map.Name.StartingCharacter, map.Name.MaxLength).Trim();
    c.Patient.Address = cl[map.Address.StartingLine].Substring(map.Address.StartingCharacter, map.Address.MaxLength).Trim();
    ...

But wait, there's more! Any time you see repetition in your code, that is a code smell. Why not extract a method here:

    public string ParseMapField(ClaimMapField field, List<string> claim)
    {
        return claim[field.StartingLine].Substring(field.StartingCharacter, field.MaxLength).Trim();
    }

Your code may now look something like this:

    HCFA_Claim c = new HCFA_Claim
    {
        Patient = new Patient
        {
            FullName = ParseMapField(map.Name, cl),
            Address = ParseMapField(map.Address, cl),
        }
    };

By breaking the code into smaller logical pieces, you can see how each piece becomes very easy to understand and verify visually. You significantly reduce the risk of copy/paste errors, and when there is a bug or a new requirement, you usually only have to change one place in the code instead of every line.
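One caveat with a Substring-based helper: String.Substring throws ArgumentOutOfRangeException when the line is shorter than the start position plus the length, which can easily happen if the sending software trims trailing spaces. A defensive variant might look like the sketch below. Returning an empty string for missing or short lines is an assumption about the desired behavior, and the ClaimFieldReader class name is hypothetical.

```csharp
using System;
using System.Collections.Generic;

public class ClaimMapField
{
    public int StartingLine { get; set; }      // 0-based
    public int StartingCharacter { get; set; } // 0-based
    public int MaxLength { get; set; }
}

public static class ClaimFieldReader
{
    // Defensive variant of ParseMapField: clamps the requested range to the
    // text that is actually present instead of letting Substring throw.
    public static string ParseMapField(ClaimMapField field, List<string> claim)
    {
        if (field.StartingLine < 0 || field.StartingLine >= claim.Count)
            return string.Empty; // assumption: a missing line means an empty field

        string line = claim[field.StartingLine];
        if (field.StartingCharacter < 0 || field.StartingCharacter >= line.Length)
            return string.Empty; // the line ends before the field starts

        int length = Math.Min(field.MaxLength, line.Length - field.StartingCharacter);
        if (length <= 0)
            return string.Empty; // zero or negative MaxLength in the map

        return line.Substring(field.StartingCharacter, length).Trim();
    }
}

public static class Program
{
    public static void Main()
    {
        var claim = new List<string> { "", "  DOE, JOHN Q" };
        var name = new ClaimMapField { StartingLine = 1, StartingCharacter = 2, MaxLength = 25 };

        // The line is shorter than 2 + 25 characters, but no exception is thrown.
        Console.WriteLine("[" + ClaimFieldReader.ParseMapField(name, claim) + "]"); // [DOE, JOHN Q]
    }
}
```

Whether a short line should silently produce an empty value or be flagged for review depends on how strictly claims need to be validated; logging such cases may help spot a doctor's layout change early.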

+1

Given the lack of standardization, I think your current solution, although not ideal, may be the best you can do. In this situation, I would at least isolate the concerns (reading the file, parsing the file, converting the file to standard XML, accessing the mapping table, and so on) into simple components, using the obvious patterns (DI, strategies, factories, repositories, etc.) where needed to decouple the system from its underlying dependency on the mapping table and the current parsing algorithms.
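As a rough illustration of that separation, the sketch below puts the mapping-table access behind a repository interface and the extraction logic behind a strategy interface, with both injected through a constructor. Every type and member name here is hypothetical; none of them comes from the original code.

```csharp
using System;
using System.Collections.Generic;

// All names below are hypothetical; they only illustrate the separation.
public class FieldLocation
{
    public int Line;      // 0-based line within a claim
    public int Start;     // 0-based character within the line
    public int MaxLength;
}

// Hides where layouts are stored (today: the mapping table).
public interface ILayoutRepository
{
    IDictionary<string, FieldLocation> GetLayoutForDoctor(int doctorId);
}

// Hides how a value is pulled out of the claim text (today: Substring).
public interface IFieldExtractor
{
    string Extract(IList<string> claimLines, FieldLocation location);
}

public class SubstringExtractor : IFieldExtractor
{
    public string Extract(IList<string> claimLines, FieldLocation location)
    {
        return claimLines[location.Line].Substring(location.Start, location.MaxLength).Trim();
    }
}

// In-memory stand-in for the mapping table, useful for tests.
public class InMemoryLayoutRepository : ILayoutRepository
{
    public IDictionary<string, FieldLocation> GetLayoutForDoctor(int doctorId)
    {
        return new Dictionary<string, FieldLocation>
        {
            { "PATIENT_NAME", new FieldLocation { Line = 0, Start = 2, MaxLength = 10 } }
        };
    }
}

// Depends only on the two abstractions, injected through the constructor.
public class ClaimReader
{
    private readonly ILayoutRepository _layouts;
    private readonly IFieldExtractor _extractor;

    public ClaimReader(ILayoutRepository layouts, IFieldExtractor extractor)
    {
        _layouts = layouts;
        _extractor = extractor;
    }

    public IDictionary<string, string> Read(int doctorId, IList<string> claimLines)
    {
        var values = new Dictionary<string, string>();
        foreach (var entry in _layouts.GetLayoutForDoctor(doctorId))
            values[entry.Key] = _extractor.Extract(claimLines, entry.Value);
        return values;
    }
}

public static class Program
{
    public static void Main()
    {
        var reader = new ClaimReader(new InMemoryLayoutRepository(), new SubstringExtractor());
        var values = reader.Read(17, new List<string> { "  DOE, JOHN   rest" });
        Console.WriteLine(values["PATIENT_NAME"]); // DOE, JOHN
    }
}
```

The point of the indirection is that the mapping table or the Substring logic could later be swapped out (for example, for a pattern-based extractor) without touching the code that builds claims.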

+2

If all you get is unstructured text, you have to parse it. If the text layout changes, you have to fix your parser. There is no way around this. You could probably find a third-party application that does some kind of visual parsing, where you highlight the text you want and it does all the substring work for you, but still: unstructured text == parsing == fragile. A visual parser would at least make it easier to spot errors or changed layouts and fix them.

As for the actual parsing, I am not sure about the line-by-line approach. What if something you are looking for spans multiple lines? You could put the whole claim into a single string and use IndexOf and Substring with different indexes for each piece of data you are looking for.

You can always use RegEx instead of Substring if you know how to do it.
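For instance, a field with a recognizable shape, such as a date, could be located by pattern rather than by exact column, which tolerates small horizontal shifts between layouts. The sketch below assumes dates are written as MM DD YYYY, MM/DD/YYYY or MM-DD-YYYY; that format, the DateFinder class, and the FindFirstDate helper are assumptions made for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DateFinder
{
    // Two-digit month and day, four-digit year, separated by space, slash or dash.
    private static readonly Regex DatePattern =
        new Regex(@"\b(\d{2})[ /-](\d{2})[ /-](\d{4})\b", RegexOptions.Compiled);

    // Returns the first date-like token on the line as MM/DD/YYYY,
    // or an empty string if none is found.
    public static string FindFirstDate(string line)
    {
        Match m = DatePattern.Match(line);
        if (!m.Success) return string.Empty;
        return m.Groups[1].Value + "/" + m.Groups[2].Value + "/" + m.Groups[3].Value;
    }
}

public static class Program
{
    public static void Main()
    {
        // The same date at different columns and with different separators.
        Console.WriteLine(DateFinder.FindFirstDate("  DOE, JOHN Q            01 15 1970   M")); // 01/15/1970
        Console.WriteLine(DateFinder.FindFirstDate("     DOE, JOHN Q      01/15/1970 M"));      // 01/15/1970
    }
}
```

A pattern alone cannot tell which date it has found when a line holds several, so this probably works best combined with the line number from the map rather than as a full replacement for it.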

+1

While the basic approach you have chosen is appropriate for your situation, there are definitely ways to clean up the code to make it easier to read and maintain. By separating out the functionality that you perform inside the main loop, you could change this:

  c.Patient.FullName = cl[Int32.Parse(M.Name.Split(',')[0]) - 1].Substring(Int32.Parse(M.Name.Split(',')[1]) - 1, Int32.Parse(M.Name.Split(',')[2])).Trim(); 

more or less like this:

    var parser = new FormParser(cl, M);
    c.Patient.FullName = parser.GetName();
    c.Patient.Address = parser.GetAddress();
    // etc

So, in your new FormParser class, you pass the List that represents your form and the claim map for the provider into the constructor. You then have a getter for each property on the form. Inside each getter, you perform your parsing/substring logic just as you do now. As I said, you are not really changing the method by which you do it, but it would certainly be easier to read and maintain, and it might reduce your overall stress level.
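A minimal sketch of such a FormParser follows. It assumes the claim map exposes each field as an already-parsed, 0-based line/character/length triple; the ClaimMap and ClaimMapField shapes below are hypothetical stand-ins, since in the question the map values are still raw strings.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins for the question's map types.
public class ClaimMapField
{
    public int StartingLine { get; set; }      // 0-based
    public int StartingCharacter { get; set; } // 0-based
    public int MaxLength { get; set; }
}

public class ClaimMap
{
    public ClaimMapField Name { get; set; }
    public ClaimMapField Address { get; set; }
}

public class FormParser
{
    private readonly List<string> _claim;
    private readonly ClaimMap _map;

    public FormParser(List<string> claim, ClaimMap map)
    {
        _claim = claim;
        _map = map;
    }

    // One getter per form property.
    public string GetName() { return Read(_map.Name); }
    public string GetAddress() { return Read(_map.Address); }

    // The one place that knows how a mapped field is cut out of the text.
    private string Read(ClaimMapField field)
    {
        return _claim[field.StartingLine]
            .Substring(field.StartingCharacter, field.MaxLength)
            .Trim();
    }
}

public static class Program
{
    public static void Main()
    {
        var claim = new List<string> { "  DOE, JOHN Q   ", "  123 MAIN ST   " };
        var map = new ClaimMap
        {
            Name = new ClaimMapField { StartingLine = 0, StartingCharacter = 2, MaxLength = 12 },
            Address = new ClaimMapField { StartingLine = 1, StartingCharacter = 2, MaxLength = 12 }
        };

        var parser = new FormParser(claim, map);
        Console.WriteLine(parser.GetName());    // DOE, JOHN Q
        Console.WriteLine(parser.GetAddress()); // 123 MAIN ST
    }
}
```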

0

Source: https://habr.com/ru/post/1391890/

