- Notifications
You must be signed in to change notification settings - Fork 196
String list new #311
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
String list new #311
Conversation
The API for the module to manipulate lists of strings has been discussed and this has resulted in the current implementation
There was a typo in some comments - corrected
Thank you for pushing this PR forward. Note #320 by @awvwgk that proposes a non-fancy stdlib should provide a uniform and consistent model of strings, whatever that model may look like. So if there's a It seems to me that the |
@epagone wrote:
I must admit, when doing a preliminary review of this pull request I was confused by the |
Not @arjenmarkus, but the same train of thought emerged from discussion with @epagone in #268 (comment). I guess it just depends which PR (#311 or #320) reaches maturity first. As long as the behavior of the |
I am not sure I understand this. The two objects are intended to be convenient offsets into the lists and as such are part of the public API and you can use them for any list. They translate to integer indices specific for the list in question. @epagone, can you elaborate? How would you encapsulate list_end with the derived type? I thought I solved your critique by not using special integer values but instead independent derived types (actually there are only two objects of that type, as the type itself is private). Perhaps I misunderstood what you suggested. Op ma 15 feb. 2021 om 11:58 schreef Ivan Pribec <notifications@github.com>: … @epagone <https://github.com/epagone> wrote: For example, I did raise a related issue <#269 (comment)> while reviewing the string list PR (together with other observations) where the namespace was polluted with a list_end variable and I suggested to encapsulated it within the derived type. Well, although we got to string list "new" <#311>, well, let's say that my comment did not get much traction <https://github.com/fortran-lang/stdlib/blob/90b06ff89ac3dce00e6c1d2f770beca38b837e34/src/stdlib_stringlist.f90#L26> smiley I must admit, when doing a preliminary review of this pull request I was confused by the list_head and list_end derived-type instances. Are these two objects part of the public API and are they supposed to be reused for multiple lists simultaneously? — You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub <#311 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AAN6YR4J7N55PHCPQ5XZLT3S7D45JANCNFSM4XFADDSQ> . |
The behaviour of stringlist_type should be independent of the actual data type - as long as the required operations are supported. I do have an API question, however: do we want to mix ordinary strings and the string_type or do we require strings to be presented as string_types only? In other words: Is this allowed (with cnversion to string_type under the hood): call list%insert( 1, "A new string" ) Or do we require: call list%insert( 1, string_type("A new string") ) Op ma 15 feb. 2021 om 12:10 schreef Ivan Pribec <notifications@github.com>: … stdlib should provide a uniform and consistent model of strings, whatever that model may look like. So if there's a string_type that provides the string essentials, stringlist_type should be a list of string_type instances. Do you agree? Not @arjenmarkus <https://github.com/arjenmarkus>, but the same train of thought emerged from discussion with @epagone <https://github.com/epagone> in #268 (comment) <#268 (comment)> . I guess it just depends which PR (#311 <#311> or #320 <#320>) reaches maturity first. As long as the behavior of the stringlist_type is not coupled tightly to the string objects in the list, I don't see any problems. — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#311 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AAN6YRYRJVGFK7GS3Y2VFETS7D6K3ANCNFSM4XFADDSQ> . |
Thanks @arjenmarkus for the explanation, it is consistent with my understanding of the desired behavior. I am impressed by the smart design, that a single
Does the |
My gut feeling is that if the non-fancy string type in #320 is accepted we should allow both. This could imply using the
Perhaps the more interesting question is whether |
Preferably we return a fixed length character whenever possible, |
@arjenmarkus I very much appreciate that you considered my previous comments (and I think that you also understood my comments correctly). I think that your design to make a "generic" Ideally, I would like to use this feature of
PS: I still cannot bite my tongue and avoid saying that I do not like that we can create "holes" in the list with acceptable "out-of-bounds" indices (unless I have misunderstood your design again 😅 ). |
FWIW, I'd prefer the first option. |
I think the idea of the proposal is that an "out-of-bounds" access is simply translated to an empty string. |
:), no, the current design silently interprets an out-of-bounds index to be at either end, so no holes anymore. Op ma 15 feb. 2021 om 13:37 schreef Emanuele Pagone < notifications@github.com>: … I am not sure I understand this. The two objects are intended to be convenient offsets into the lists and as such are part of the public API and you can use them for any list. They translate to integer indices specific for the list in question. @epagone <https://github.com/epagone>, can you elaborate? How would you encapsulate list_end with the derived type? I thought I solved your critique by not using special integer values but instead independent derived types (actually there are only two objects of that type, as the type itself is private). Perhaps I misunderstood what you suggested. @arjenmarkus <https://github.com/arjenmarkus> I very much appreciate that you considered my previous comments (and I think that you also understood my comments correctly). I think that your design to make a "generic" list_end derived type might be too clever for some less-CS-literate Fortraneers like me: I did not understand that it was meant to be valid for any list. It might also confuse some users that might be wondering where list_start or list_head is (the latter is private instead, IIUC). Ideally, I would like to use this feature of stdlib according to two alternative scenarios. 1. This module will become a dependency (or an extension?) of a generic list module; 2. list_end is an integer encapsulated in type(stringlist_type). PS: I still cannot bite my tongue and avoid saying that I do not like that we can create "holes" in the list with acceptable "out-of-bounds" indices (unless I have misunderstood your design again 😅 ). — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#311 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AAN6YR6OYZBUEX7MUHT4UTTS7EIQBANCNFSM4XFADDSQ> . |
Accessing an out-of-bounds element is indeed interpreted as an empty string, but you cannot insert a string in, say, position 100, if there are only 3 strings filled. It would become string no. 4. Op ma 15 feb. 2021 om 13:42 schreef Ivan Pribec <notifications@github.com>: … PS: I still cannot bite my tongue and avoid saying that I do not like that we can create "holes" in the list with acceptable "out-of-bounds" indices (unless I have misunderstood your design again sweat_smile ). I think the idea of the proposal is that an "out-of-bounds" access is simply translated to an empty string. — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#311 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AAN6YR26EVJIDQNZ5QGSX4TS7EJB5ANCNFSM4XFADDSQ> . |
Are we sure that we want to be limited by GCC 8 bugs? I understand it's not a super-old release, but considering that |
This means the current implementation allows to append at the end of a string list by: call list%insert(huge(0), "value") |
Understood, thanks. I like this solution better than the previous one, but I still think that this opens the door to hard to find bugs when we do not enforce a strict check on bounds. |
According to this answer (what is the biggest array size for double precision in Fortran 90?):
The actual array size seems to be limited by the fact, the size of the array in bytes must fit in a Of course the amount of available RAM is the second limitation which enters into consideration, particularly when the |
I just want to note that using A cyclic indexing system could address this problem
|
Should the (Alternatively one could pursue an |
Thank you. IMO this is the best option for its flexibility. |
Would this match the Python semantics as mentioned in #268 (comment) by @certik? |
I made a mistake in the comment above, sorry about that, ignore the list there, I made some clearer tables below. Python is using a zero based indexing in the forward direction and one based indexing in the backwards direction. This approach does not allow to access beyond the first item with the forward index and also you cannot access after the last element with the backwards index:
For Fortran we would use a one based forward index and maybe also a one based backwards index, but we will have one index leftover in this approach, which is zero:
The cyclic indexing I wanted to propose was using all possible indices with:
The one based indexing for both forward and backwards direction looks more intuitive, the only question is what do we do with the zero in this approach? |
I greatly prefer to use such an optional argument above the use of a function. Op ma 15 feb. 2021 om 17:45 schreef Emanuele Pagone < notifications@github.com>: … Should the insert and append provide an optional stat argument to query for allocation errors? Thank you. IMO this is the best option for its flexibility. — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#311 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AAN6YR22VGAY5COCVMPP2SLS7FFQ5ANCNFSM4XFADDSQ> . |
Hm, my first implementation did something along these lines, but it led to ambiguities. Hence the introduction of a separate type to represent the start and stop indices (and a simple arithmetic) Op ma 15 feb. 2021 om 20:19 schreef Ivan Pribec <notifications@github.com>: … A cyclic indexing system could address this problem - 0, -len(list)-1: before the first element of the list - 1, -len(list): first element of the list - len(list), -2: last element of the list - len(list)+1, -1: after the last element of the list Does this match with the Python semantics as mentioned in #268 (comment) <#268 (comment)> by @certik <https://github.com/certik>? — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#311 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AAN6YR35OJIDYECH4XXHSPTS7FXV7ANCNFSM4XFADDSQ> . |
Could type(stringlist_type) :: list associate(end => list%end()) ! list_end points to a private stringlist_index_t instance ! append to end of the list call list%insert(end,"one") call list%insert(end,"two") call list%insert(end,"three") ! prepend using an integer position index (in this case head) call list%insert(0,"zero") ! Print values "one", "two", "three" do i = 2, end%as_int() ! ideally, this would be a protected member component, i.e. `end%idx` print *, list%get(i) end do end associate Edit: I have devised an integer list as proof of concept. It appears to work correctly with Intel Fortran, but with gfortran (gcc version 10.1.0), the associate construct does not work correctly. The code can be found at: https://gist.github.com/ivan-pi/e2181b4335a1d93cbeb3d422c6732cfa Returned results
|
call append_stringlist%insert( list_after_end, slist ) | ||
end function append_stringlist | ||
| ||
function append_stringarray( list, sarray ) |
This comment was marked as resolved.
This comment was marked as resolved.
Sorry, something went wrong.
With the |
Sure, it would be a straightforward replacement of the type that is now used internally. Op za 13 mrt. 2021 om 13:00 schreef Sebastian Ehlert < ***@***.***>: … With the string_type merged, can this patch make use of the newly available functionality? — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#311 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AAN6YRY3C6WYFUXJLVCR42DTDNHWNANCNFSM4XFADDSQ> . |
@aman-godara, could you elaborate? I am not quite seeing what you mean ;). Op do 11 mrt. 2021 om 15:49 schreef Aman Godara ***@***.***>: … ***@***.**** commented on this pull request. ------------------------------ In src/stdlib_stringlist.f90 <#311 (comment)>: > + type(stringlist_type) :: prepend_string + + prepend_string = list + call prepend_string%insert( list_head, string ) +end function prepend_string + +function append_stringlist( slist, list ) + type(stringlist_type), intent(in) :: list + type(stringlist_type), intent(in) :: slist + type(stringlist_type) :: append_stringlist + + append_stringlist = list + call append_stringlist%insert( list_after_end, slist ) +end function append_stringlist + +function append_stringarray( list, sarray ) in my opinion it will be better to have arguments as ( sarray , list ) so that it becomes coherent with the previous functions convention (append_stringlist, prepend_string, etc.). — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#311 (review)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AAN6YR4XZO776ACL2K76KF3TDDJ6JANCNFSM4XFADDSQ> . |
This comment has been minimized.
This comment has been minimized.
Yes, that is right. That is a general feature with functions that return values, not pointers. Op wo 31 mrt. 2021 om 12:48 schreef Aman Godara ***@***.***>: … ***@***.**** commented on this pull request. ------------------------------ In src/stdlib_stringlist.f90 <#311 (comment)>: > + string_greater = string1%value > string2%value +end function string_greater + +elemental logical function string_equal( string1, string2 ) + type(string_type), intent(in) :: string1 + type(string_type), intent(in) :: string2 + + string_equal = string1%value == string2%value +end function string_equal + +function append_string( list, string ) + type(stringlist_type), intent(in) :: list + character(len=*), intent(in) :: string + type(stringlist_type) :: append_string + + append_string = list I have a query in this line: I think this will create a copy (deep copy) of list which will be assigned to append_string here. Have I got it right? — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#311 (review)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AAN6YR4PGSRISRY3CJUXKRLTGL4XRANCNFSM4XFADDSQ> . |
In Fortran the result of a function is contained in a variable with the name of that function, unless you use the clause "result(varname)". So you do not return the value as with C-like languages: myResult = ...; return myResult; but instead: myResult = ... return ! If you really want to return explicitly |
Oops, I should have made sure "myResult" is the result variable ;). So: real function myResult( ... ) ... declare arguments myResult = ... end function myResult Or: function intriguingCalculation( ... ) result(myResult) ... declare arguments real :: myResult myResult = ... end function intriguingCalculation end |
What is the current status of this patch? I marked it as waiting for changes, but maybe this is incorrect and we should rather look for more reviewers. @arjenmarkus, what do you think? |
It has fallen between the cracks :(. I will have to look at it again, especially the possibility to use string_type in stead of the builtin character string. Thanks for reminding me! Op ma 31 mei 2021 om 10:32 schreef Sebastian Ehlert < ***@***.***>: … What is the current status of this patch? I marked it as waiting for changes, but maybe this is incorrect and we should rather look for more reviewers. @arjenmarkus <https://github.com/arjenmarkus>, what do you think? — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#311 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AAN6YR54PBOTVQNEOIKZIM3TQNCSBANCNFSM4XFADDSQ> . |
First step to pick it up again was to use stdlib's string_type module instead of my home-grown version. Mostly works now, but I discovered a few oddities in the test programs. I must have overlooked them before - will have to use stdlib_error's check to aoid this in future :). Hopefully it is indeed just the test programs that are wrong. Op ma 31 mei 2021 om 16:46 schreef Arjen Markus ***@***.***>: … It has fallen between the cracks :(. I will have to look at it again, especially the possibility to use string_type in stead of the builtin character string. Thanks for reminding me! Op ma 31 mei 2021 om 10:32 schreef Sebastian Ehlert < ***@***.***>: > What is the current status of this patch? I marked it as waiting for > changes, but maybe this is incorrect and we should rather look for more > reviewers. @arjenmarkus <https://github.com/arjenmarkus>, what do you > think? > > — > You are receiving this because you were mentioned. > Reply to this email directly, view it on GitHub > <#311 (comment)>, > or unsubscribe > <https://github.com/notifications/unsubscribe-auth/AAN6YR54PBOTVQNEOIKZIM3TQNCSBANCNFSM4XFADDSQ> > . > |
This is a bit of confusing. We have |
This might help: #399 (comment). |
else | ||
do i = idxn, list%size | ||
inew = i + number | ||
call move_alloc( list%string(i)%value, list%string(inew)%value ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
move_alloc(FROM, TO)
@arjenmarkus if I have understood it correctly what happens if there is already a string at index inew
(TO
)?
If it gets overwritten we will lose what was stored at index inew
.
for example:
list = ["1", "2", "3", "4", "5", "6", "7", "8"]
and something like this executes:
do i = 3, 8 inew = i + 3 call move_alloc(list%string(i)%value , list%string(inew)%value)
Then I will get
["1", "2", _, _, _, _, _, _, "3", "4", "5"]
_
represents that it is deallocated.
Closed in favor of #470 |
No description provided.